Abstract

We propose a new and easy-to-use method for identifying cointegrated components of 
nonstationary time series, consisting of an eigenanalysis for a certain non-negative definite matrix. 

Our setting is model-free, and we allow the integer-valued integration orders of the observable 
series to be unknown, and to possibly differ. Consistency of estimates of the cointegration 
space and cointegration rank is established both when the dimension of the observable time 
series is fixed as sample size increases, and when it diverges slowly. A Monte Carlo study 
of finite-sample performance, and a small empirical illustration, are reported. Asymptotic 
justification of the method is also established in a fractional setting. 
1 Introduction 


Cointegration entails a dimensionality reduction of certain observable multiple time series that are 
dominated by common components. In particular a multiple time series can be said to be (linearly) 
cointegrated if there exists an instantaneous linear combination, or cointegrating error, with lower 
integration order. Much of the vast literature, following Box and Tiao (1977), Granger (1981), 
Engle and Granger (1987), has focused on unit root series which have one or more short memory 
cointegrating errors, but there have been extensions to nonstationary series with other integer 
orders of integration, allowing also for the possibility of some nonstationary cointegrating errors, 
as well as to fractional nonstationary, and even stationary, observable series and cointegrating 
errors, with unknown integration orders. Much of the early literature, in particular, assumed a 
complete parameterization of second order properties, where in particular the observable series 
are generated from short memory inputs that have finite autoregressive moving average (ARMA) 
structure, but it has also been common to study semiparametric settings, with underlying short 
memory inputs having nonparametric autocorrelation, see e.g. Stock (1987), Phillips (1991), in 
some cases without sacrificing precision relative to a correctly specified parametric structure. 

Given knowledge of the cointegration rank, r, of a p-dimensional observable series, that is 
the number of cointegrating relations, various methods are available for estimating the unknown 
parameters of the model, such as the coefficients of the cointegrating errors, and even of unknown 
integration orders, and for carrying out asymptotically valid, and sometimes even efficient, statistical 
inference. However, r might not be known to the practitioner, and various approaches for 
estimating r from the data have been developed, starting from Engle and Granger (1987) and 
Johansen (1991), in their parametric, unit root vector autoregressive (VAR) setting, and continuing 
with, for example, Aznar and Salvador (2002) and Saikkonen and Lütkepohl (2000). If, however, 
the order of the VAR is underspecified, or not all observable series have a single unit root, 
then typically the resulting specification error will invalidate such approaches, not to mention 
rules of statistical inference on unknown coefficients in the model. It is possible that one or more 
of the nonstationary observable processes could have two or more unit roots, or indeed could 
have fractional orders of integration, as supported by some empirical investigations. References 
that allow for nonparametric autocorrelation and/or unknown integration orders include Phillips 
and Ouliaris (1988, 1990), Stock (1999), Shintani (2001), Harris and Poskitt (2004), Li, Pan and 
Yao (2009) in the case of integer integration orders, and Robinson and Yajima (2002), Chen and 
Hurvich (2006), Robinson (2008) in the case of fractional integration orders, including in the latter 
setting cases where observables are stationary and the cointegrating errors are stationary with 
less memory. 
Like Phillips and Ouliaris (1988), Robinson and Yajima (2002), Harris and Poskitt (2004), 
Li, Pan and Yao (2009), we employ methods based on eigenanalysis. In our case, in the setting 
of nonparametric autocorrelation and unknown (and possibly different) integration orders, we 
employ eigenvalues of a certain non-negative definite matrix function of sample autocovariance 
matrices of the observable series, for estimating cointegration rank, with the cointegration space 
then estimated by selection of eigenvectors, and cointegrating errors thereby proxied. Though the 
initial development assumes that observable series have integer orders and cointegrating errors 
have short memory, we extend these results to allow for observables to be fractionally nonstationary, 
and cointegrating errors to be fractionally stationary. In both circumstances we establish 
consistency of our estimates of cointegration rank and space with p fixed as the length of our 
time series, n, diverges. In case of integer integration orders, we also establish consistency allowing 
p to diverge slowly with n. 

The rest of the paper is organized as follows. The proposed methodology is presented in Section 
2. Asymptotic theory with integer order of integration is developed in Section 3. Simulations and 
a small real data example are reported in Section 4. In Section 5, both the proposed method and part of 
the asymptotic theory are extended to the fractional case. All statements and proofs are relegated 
to an Appendix, which also contains a number of technical lemmas. 

2 Methods 

2.1 Setting 

We call a vector process $u_t$ weakly stationary if (i) $E u_t$ is a constant vector independent of $t$, 
and (ii) $E\|u_t\|^2 < \infty$, and $\mathrm{Cov}(u_t, u_{t+s})$ depends on $s$ only for any integers $t, s$, where $\|\cdot\|$ 
denotes the Euclidean norm. Denote by $\nabla$ the difference operator, i.e. $\nabla u_t = u_t - u_{t-1}$, and 
$\nabla^d u_t = \nabla(\nabla^{d-1} u_t)$ for any integer $d \ge 1$. We use the convention $\nabla^0 u_t = u_t$. Further, if $u_t$ has 
spectral density matrix that is finite and positive definite at zero frequency we say $u_t$ is an $I(0)$ 
process. An example of an $I(0)$ process is a stationary and invertible vector ARMA, and many $I(0)$ 
processes satisfy Condition 1 of Section 3.1 below, imposed for our asymptotic theory, including 
the examples described immediately after Condition 1. Now denote by $u_{it}$ the $i$th element of $u_t$ 
and define $\tilde u_{it} = u_{it} 1(t \ge 1)$, where $1(\cdot)$ is the indicator function. For an $m$-dimensional $I(0)$ 
process $u_t$ and non-negative integers $d_1, \ldots, d_m$ we say that $v_t = (\nabla^{-d_1}\tilde u_{1t}, \ldots, \nabla^{-d_m}\tilde u_{mt})'$ is an 
($m$-dimensional) $I(d_1, \ldots, d_m)$ process, with some abuse of notation when $m = 1$, $d_1 = 0$. Note 
that for $d_1 = \cdots = d_m = 0$, $v_t$ is not $I(0)$ or even weakly stationary or equivalent to $u_t$ due to the 
truncation (implying $v_t = 0$, $t \le 0$) that is imposed in order to achieve bounded variance in case 
of positive $d_i$, but it is 'asymptotically' weakly stationary and $I(0)$. When $d_1 = \cdots = d_m = 1$, all 
elements of $v_t$ have a single unit root, but we are concerned with processes for which $d_i$ can vary 
over $i$. 
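
As a minimal sketch of the truncated definition above: for integer d, applying the inverse difference operator d times to a series that is set to zero for t <= 0 amounts to a d-fold cumulative sum. The function names `integrate` and `difference` are illustrative choices, not notation from the paper, and a single common order d is assumed for all components.

```python
import numpy as np

def integrate(u, d):
    """Apply the inverse difference operator d times to u_t 1(t >= 1).

    u holds u_1, ..., u_n (shape (n,) or (n, m)); the result is 0 for t <= 0
    by construction, so each step is a cumulative sum starting at t = 1.
    """
    v = np.asarray(u, dtype=float)
    for _ in range(d):
        v = np.cumsum(v, axis=0)
    return v

def difference(v, d):
    """Apply the difference operator d times, treating v_t as 0 for t <= 0."""
    v = np.asarray(v, dtype=float)
    for _ in range(d):
        v = np.diff(v, axis=0, prepend=np.zeros_like(v[:1]))
    return v

rng = np.random.default_rng(1)
u = rng.standard_normal(500)      # a simple I(0) input (white noise)
v2 = integrate(u, 2)              # an I(2) series in the sense defined above
```

Differencing `v2` twice with the same zero start recovers `u`, which is the sense in which the two operators are inverse to each other here.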

Now assume a $p \times 1$ observable time series $y_t$ is $I(d_1, \ldots, d_p)$ for non-negative integers $d_1, \ldots, d_p$, and 
admits the following form 

$$y_t = A x_t, \qquad (2.1)$$

where $A$ is an unknown and invertible constant matrix, $x_t = (x_{t1}', x_{t2}')'$ is a latent $p \times 1$ process, 
$x_{t2}$ is an $r \times 1$ $I(0)$ process, and $x_{t1}$ is an $I(c_1, \ldots, c_{p-r})$ process, where each $c_j$ is an element of 
the set $\{d_1, \ldots, d_p\}$. Furthermore no linear combination of $x_{t1}$ is $I(0)$, as such a stationary variable 
can be absorbed into $x_{t2}$. Each component of $x_{t2}$ is a cointegrating error of $y_t$ and $r \ge 0$ is 
the cointegration rank. In the event that there exists no cointegration among the components 
of $y_t$, $r = 0$. When $y_t$ itself is $I(0, \ldots, 0)$, $r = p$. But these are two extreme cases. Note that 
cointegration requires equality of at least two $d_j$. For many economic and financial applications, 
there exist a small number of cointegrated variables, i.e. $r \ge 1$ is a small integer. 

Note that $A$ and $x_t$ in (2.1) are not uniquely defined, as $(A, x_t)$ can be replaced by $(AH^{-1}, Hx_t)$ 
for any invertible $H$ of the form 

$$H = \begin{pmatrix} H_{11} & H_{12} \\ 0 & H_{22} \end{pmatrix},$$

where $H_{11}, H_{22}$ are square matrices of size $(p - r)$, $r$ respectively, and 0 denotes a matrix with 
all entries equal to 0. Therefore there is no loss of generality in assuming $A$ to be orthogonal, 
because any non-orthogonal $A$ admits the decomposition $A = QU$, where $Q$ is orthogonal and 
$U$ is upper-triangular, and we may then replace $(A, x_t)$ in (2.1) by $(Q, Ux_t)$. In the sequel, we 
always assume that $A$ in (2.1) is orthogonal, i.e. $A'A = I_p$, where $I_p$ denotes the $p \times p$ identity 
matrix. Write 

$$A = (A_1, A_2),$$

where $A_1$ and $A_2$ are, respectively, $p \times (p - r)$ and $p \times r$ matrices. As now $x_{t2} = A_2' y_t$, the linear 
space spanned by the columns of $A_2$, denoted by $\mathcal{M}(A_2)$, is called the cointegration space. In 
fact this cointegration space is uniquely defined by (2.1), though $A_2$ itself is not. 
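
The QU argument can be checked numerically. The sketch below is an illustration under assumed settings (a randomly drawn A with p = 5 and r = 2, and a helper `projector` introduced here): it verifies that the last r columns of Q span the same space as the last r columns of $(A^{-1})'$, which are the directions that recover $x_{t2}$ from $y_t$ when A is not orthogonal.

```python
import numpy as np

rng = np.random.default_rng(2)
p, r = 5, 2
A = rng.uniform(-3, 3, size=(p, p))     # generic (almost surely invertible) loading matrix
Q, U = np.linalg.qr(A)                  # A = Q U with Q orthogonal, U upper-triangular

def projector(B):
    """Orthogonal projector onto the column space of B (B has full column rank)."""
    return B @ np.linalg.solve(B.T @ B, B.T)

Q2 = Q[:, p - r:]                       # last r columns of Q
B2 = np.linalg.inv(A).T[:, p - r:]      # last r columns of (A^{-1})', so x_{t2} = B2' y_t
same_space = np.allclose(projector(Q2), projector(B2))
```

Comparing projectors rather than the matrices themselves reflects the point made in the text: the space is identified, a particular basis of it is not.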

2.2 Estimation 

The goal is to determine the cointegration rank $r$ in (2.1) and to identify $A_2$, or more precisely 
$\mathcal{M}(A_2)$. Then $\mathcal{M}(A_1)$ is the orthogonal complement of $\mathcal{M}(A_2)$, and $x_{ti} = A_i' y_t$ for $i = 1, 2$. Our 
estimation method is motivated by the following observation. For $j \ge 0$, let 

$$\hat\Sigma_j = \frac{1}{n}\sum_{t=1}^{n-j} (y_{t+j} - \bar y)(y_t - \bar y)', \qquad \bar y = \frac{1}{n}\sum_{t=1}^{n} y_t.$$

For any $a \in \mathcal{M}(A_2)$, $a'\hat\Sigma_j a$ is the sample autocovariance function at lag $j$ for the weakly stationary 
univariate time series $a'y_t$, and it converges to a finite constant (i.e. the autocovariance function 
of $a'y_t$ at lag $j$) almost surely under some mild conditions. However for any $a \notin \mathcal{M}(A_2)$, $a'y_t$ is 
$I(d)$ for some $d \ge 1$, and 

$$a'\hat\Sigma_j a = O_e(n^{2d-1}) \ \text{or} \ O_e(n^{2d}), \qquad (2.2)$$

depending on whether $E(a'y_t) = 0$ or not, see Theorems 1 & 2 of Peña and Poncela (2006). In the 
above expression, $U = O_e(V)$ indicates that $P(C_1 \le |U/V| \le C_2) \to 1$ as $n \to \infty$, where 
$C_2 > C_1 > 0$ are two finite constants. Hence intuitively the $r$ directions in the cointegration space 
$\mathcal{M}(A_2)$ make $|a'\hat\Sigma_j a|$ as small as possible for all $j \ge 0$. 

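To combine information over different lags, define 

$$W = \sum_{j=0}^{j_0} \hat\Sigma_j \hat\Sigma_j', \qquad (2.3)$$

where $j_0 \ge 1$ is a prespecified and fixed integer. We use the product $\hat\Sigma_j\hat\Sigma_j'$ instead of $\hat\Sigma_j$ to ensure 
each term in the sum is non-negative definite, and that there is no information cancellation over 
different lags. Note that $a'\hat\Sigma_j\hat\Sigma_j' a = O_e(1)$ if $a \in \mathcal{M}(A_2)$, and is at least of the order of $n^2$ if 
$a \in \mathcal{M}(A_1)$. Hence intuitively $\mathcal{M}(A_2)$ should be the linear space spanned by the $r$ eigenvectors 
of $W$ corresponding to the $r$ smallest eigenvalues, and $\mathcal{M}(A_1)$ is that spanned by the $(p - r)$ 
eigenvectors of $W$ corresponding to the $(p - r)$ largest eigenvalues. This point can be further 
elucidated as follows. Let $(\hat\gamma_1, \ldots, \hat\gamma_p)$ be the orthonormal eigenvectors of $W$ corresponding to 
the eigenvalues arranged in descending order and 

$$\hat A = (\hat A_1, \hat A_2) = (\hat\gamma_1, \ldots, \hat\gamma_p),$$

then 

$$A'WA = \sum_{j=0}^{j_0} (A'\hat\Sigma_j A)(A'\hat\Sigma_j' A)
= \sum_{j=0}^{j_0} \begin{pmatrix}
A_1'\hat\Sigma_j A_1 A_1'\hat\Sigma_j' A_1 + A_1'\hat\Sigma_j A_2 A_2'\hat\Sigma_j' A_1 &
A_1'\hat\Sigma_j A_1 A_1'\hat\Sigma_j' A_2 + A_1'\hat\Sigma_j A_2 A_2'\hat\Sigma_j' A_2 \\
A_2'\hat\Sigma_j A_1 A_1'\hat\Sigma_j' A_1 + A_2'\hat\Sigma_j A_2 A_2'\hat\Sigma_j' A_1 &
A_2'\hat\Sigma_j A_2 A_2'\hat\Sigma_j' A_2 + A_2'\hat\Sigma_j A_1 A_1'\hat\Sigma_j' A_2
\end{pmatrix}. \qquad (2.4)$$

The (1,1)-th block on the RHS is dominated by $A_1'\hat\Sigma_j A_1 A_1'\hat\Sigma_j' A_1$. The (2,2)-th block consists 
of two lower order terms, and is dominated by $A_2'\hat\Sigma_j A_2 A_2'\hat\Sigma_j' A_2$ as 
$A_2'\hat\Sigma_j A_1 A_1'\hat\Sigma_j' A_2 = O_p(1)$ (since $j_0$ is fixed). Consequently, we estimate $A$ and $x_t$ by 

$$\hat A = (\hat A_1, \hat A_2), \quad \text{and} \quad \hat x_t = \hat A' y_t = ((\hat A_1' y_t)', (\hat A_2' y_t)')'. \qquad (2.5)$$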
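
The estimation step can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, the sample autocovariance is normalised by n as in the definition of the lag-j matrix above, and the toy data (two random walks plus two white-noise series, mixed by a random orthogonal matrix) are invented for the example.

```python
import numpy as np

def sample_autocov(y, j):
    """Lag-j sample autocovariance matrix of an (n, p) array y, normalised by n."""
    n = y.shape[0]
    yc = y - y.mean(axis=0)
    return yc[j:].T @ yc[:n - j] / n       # sum_t (y_{t+j} - ybar)(y_t - ybar)' / n

def eigen_cointegration(y, j0=5):
    """Eigenvalues (descending) and matching eigenvectors of W in (2.3)."""
    p = y.shape[1]
    W = np.zeros((p, p))
    for j in range(j0 + 1):
        S = sample_autocov(y, j)
        W += S @ S.T                       # each term is non-negative definite
    vals, vecs = np.linalg.eigh(W)         # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Toy data: p - r random walks and r white-noise series, mixed by an orthogonal A.
rng = np.random.default_rng(0)
n, p, r = 1000, 4, 2
x1 = np.cumsum(rng.standard_normal((n, p - r)), axis=0)   # I(1) components
x2 = rng.standard_normal((n, r))                          # I(0) components
A, _ = np.linalg.qr(rng.standard_normal((p, p)))
y = np.hstack([x1, x2]) @ A.T                             # rows are y_t' = (A x_t)'

lam, A_hat = eigen_cointegration(y)
x_hat = y @ A_hat                                         # rows are (A_hat' y_t)'
```

With the columns ordered this way, the last r columns of `A_hat` play the role of the estimated cointegration directions and the last r columns of `x_hat` are the proxied cointegrating errors.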
The idea of using an eigenanalysis based on a quadratic form of sample autocovariance matrices 
has been used for factor modelling for dimension reduction (Lam and Yao 2012, and references 
within), and for segmenting a high-dimensional time series into several both contemporaneously 
and serially uncorrelated subseries (Chang et al. 2014). One distinctive advantage of using the 
quadratic form $\hat\Sigma_j\hat\Sigma_j'$ instead of $\hat\Sigma_j$ in (2.3) is that there is no information cancellation over 
different lags. Therefore this approach is insensitive to the choice of $j_0$ in (2.3). Often small 
values such as $j_0 = 5$ are sufficient to catch the relevant characteristics, as serial dependence is 
usually most predominant at small lags. Using different values of $j_0$ hardly changes the results; 
see Table 4 in Section 4 below, and also Lam and Yao (2012) and Chang et al. (2014). 

2.3 Determining cointegration ranks 

The components of $\hat x_t = \hat A' y_t = (\hat x_t^1, \ldots, \hat x_t^p)'$, defined in (2.5), are arranged according to the 
descending order of the eigenvalues of $W$. Therefore, the order of the components reflects reversely 
the likeness of the stationarity of those component series, with $\{\hat x_t^p\}$ most likely being a stationary 
cointegrating error series. Hence the unit-root tests (Phillips and Ouliaris, 1988) can be applied 
to each of the component series $\{\hat x_t^p\}, \{\hat x_t^{p-1}\}, \ldots$ to determine the cointegration rank $r$. Below 
we propose some alternative criteria to determine $r$. 

Let $\hat\lambda_1 \ge \cdots \ge \hat\lambda_p \ge 0$ be the eigenvalues of $W$. By (2.4) and (2.2), $\hat\lambda_i$ is at least of the order 
of $n^2$ for all $1 \le i \le p - r$, and $\hat\lambda_i = O_p(1)$ for all $p - r < i \le p$. Hence as long as $1 \le r < p$, 
$\hat\lambda_i/(n\hat\lambda_p) \to \infty$ in probability for all $1 \le i \le p - r$, and $\hat\lambda_i/(n\hat\lambda_p) = o_p(1)$ for all $p - r < i \le p$. 
This leads to estimating $r$ by 

$$\hat r = \max\{j : \hat\lambda_{p+1-j}/(n\hat\lambda_p) < 1, \ 1 \le j \le p\}. \qquad (2.6)$$

See also Lam and Yao (2012) and Ahn and Horenstein (2013) for procedures based on eigenvalue 
ratios for factor models. 
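
A minimal sketch of the ratio rule (2.6), assuming the eigenvalues of W are supplied in descending order and that the smallest one is strictly positive. The function name and the example eigenvalues are hypothetical.

```python
import numpy as np

def rank_by_ratio(lam, n):
    """Estimate r via (2.6): the largest j with lam_{p+1-j} / (n * lam_p) < 1.

    lam: eigenvalues of W in descending order; n: sample size.
    """
    lam = np.asarray(lam, dtype=float)
    ratios = lam[::-1] / (n * lam[-1])     # ratios[j-1] = lam_{p+1-j} / (n lam_p)
    js = np.nonzero(ratios < 1)[0] + 1     # all j satisfying the inequality
    return int(js.max())

# Two very large and two bounded eigenvalues, with n = 1000.
r_hat = rank_by_ratio([5e6, 2e5, 3.0, 1.5], n=1000)
```

For n > 1 the index j = 1 always satisfies the inequality, so the maximum is taken over a non-empty set and the estimate is at least 1.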

Alternatively we may define a simple information criterion as follows: 

$$\mathrm{IC}(l) = \sum_{j=1}^{l} \hat\lambda_{p+1-j} + (p - l)\,\omega_n,$$

where $\omega_n \to \infty$, [...] $\to 0$ in probability (as we allow $\omega_n$ to be data-dependent), and $d_{\min}$ 
is the smallest integration order among all the components of $y_t$. Then $r$ can be estimated as 

$$\tilde r = \arg\min_{1 \le l \le p} \mathrm{IC}(l). \qquad (2.7)$$

Note that when $\omega_n = n\hat\lambda_p$, it holds that $\tilde r = \hat r$. 
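
The information criterion can be sketched as below. The form used here, IC(l) equal to the sum of the l smallest eigenvalues plus (p - l) times the penalty, is a reconstruction from the damaged display and should be read as an assumption; it is at least consistent with the closing remark, since IC(l) - IC(l-1) equals the l-th smallest eigenvalue minus the penalty, so with penalty $n\hat\lambda_p$ the minimiser coincides with the ratio rule. The function name and example values are hypothetical.

```python
import numpy as np

def rank_by_ic(lam, omega):
    """Estimate r by minimising IC(l) = sum_{j<=l} lam_{p+1-j} + (p - l) * omega.

    lam: eigenvalues of W in descending order; omega: penalty (may be data-dependent).
    """
    lam = np.asarray(lam, dtype=float)
    p = lam.size
    ls = np.arange(1, p + 1)
    tail_sums = np.cumsum(lam[::-1])       # tail_sums[l-1] = sum of the l smallest eigenvalues
    ic = tail_sums + (p - ls) * omega
    return int(ls[np.argmin(ic)])

lam = [5e6, 2e5, 3.0, 1.5]
n = 1000
r_tilde = rank_by_ic(lam, omega=n * lam[-1])   # penalty n * lam_p, as in the remark above
```

On these example eigenvalues the criterion takes the values 4501.5, 3004.5, 201504.5 and 5200004.5 for l = 1, 2, 3, 4, so the minimiser is l = 2.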
3 Asymptotic Properties 

In this section, we investigate the asymptotic properties of the proposed statistics. First, we show 
that with $r$ given, the linear space $\mathcal{M}(\hat A_2)$ is a consistent estimator for the cointegration space 
$\mathcal{M}(A_2)$. We measure the distance between the two spaces by 

$$D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2)) = \sqrt{1 - \frac{1}{r}\,\mathrm{tr}(\hat A_2 \hat A_2' A_2 A_2')}. \qquad (3.1)$$

Then $D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2)) \in [0, 1]$, being 0 if and only if $\mathcal{M}(\hat A_2) = \mathcal{M}(A_2)$, and 1 if and only 
if $\mathcal{M}(\hat A_2)$ and $\mathcal{M}(A_2)$ are orthogonal. Furthermore, we show that both the estimators $\hat r$ and $\tilde r$, 
defined respectively in (2.6) and (2.7), are consistent for the cointegration rank $r$. We consider 
two asymptotic regimes: (i) $p$ is fixed while $n \to \infty$, and (ii) $p \to \infty$ more slowly than $n$. In this 
section we always assume that $j_0$ in (2.3) is a fixed positive integer. 
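
The distance (3.1) is straightforward to compute when both matrices have orthonormal columns. The sketch below assumes the square-root form of (3.1) and uses a hypothetical function name; the two toy cases illustrate the stated endpoints 0 and 1.

```python
import numpy as np

def subspace_distance(A2_hat, A2):
    """D of (3.1) for two p x r matrices with orthonormal columns."""
    r = A2.shape[1]
    inner = np.trace(A2_hat @ A2_hat.T @ A2 @ A2.T) / r
    return float(np.sqrt(max(1.0 - inner, 0.0)))   # clip tiny negative rounding error

I4 = np.eye(4)
angle = 0.3
rot = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
d_same = subspace_distance(I4[:, 2:] @ rot, I4[:, 2:])   # same space, rotated basis
d_orth = subspace_distance(I4[:, :2], I4[:, 2:])         # mutually orthogonal spaces
```

Because only the projectors $\hat A_2\hat A_2'$ and $A_2A_2'$ enter, the value does not depend on which orthonormal basis represents each space.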

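Put $x_{t1} = (x_t^1, \ldots, x_t^{p-r})'$. Under (2.1), $x_t^j$ is $I(d_j)$ for $1 \le j \le p - r$ and $z_t^j = \nabla^{d_j} x_t^j$ is $I(0)$, 
where $d_j \ge 1$ is an integer. Write $z_t = (z_t^1, \ldots, z_t^{p-r})'$ and $\varepsilon_t = (z_t', x_{t2}')'$. Denote the vector of 
partial sums of components of $\varepsilon_t$ by 

$$S_n^p(t) = \frac{1}{\sqrt{n}} \Big( \sum_{s=1}^{[nt_1]} (\varepsilon_s^1 - E\varepsilon_s^1), \ \ldots, \ \sum_{s=1}^{[nt_p]} (\varepsilon_s^p - E\varepsilon_s^p) \Big)',$$

where $0 \le t_1 < \cdots < t_p \le 1$ are constants and $t = (t_1, \ldots, t_p)'$. 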

3.1 When $n \to \infty$ and $p$ is fixed 

We introduce a regularity condition first. 

Condition 1. 

(i) There exists a Gaussian process $W(t) = (W^1(t_1), \ldots, W^p(t_p))'$ such that as $n \to \infty$, 

$$S_n^p(t) \stackrel{J_1}{\Longrightarrow} W(t), \quad \text{on } D^p[0, 1],$$

where $\stackrel{J_1}{\Longrightarrow}$ denotes weak convergence under Skorohod $J_1$ topology (Chapter 3 in Billingsley 
1999), and $W(1)$ has a positive definite covariance matrix $\Omega = (\sigma_{ij})$. 

(ii) The sample autocovariance matrix of $x_{t2}$ satisfies 

$$\sup_{0 \le j \le j_0} \Big\| \frac{1}{n}\sum_{t=1}^{n-j} (x_{t+j,2} - \bar x_2)(x_{t2} - \bar x_2)' - \mathrm{Cov}(x_{1+j,2}, x_{1,2}) \Big\|_2 \stackrel{p}{\longrightarrow} 0,$$

where $\|H\|_2 = \sup_{\|a\| = 1} \|Ha\|$ is the $L_2$-norm of matrix $H$, $\bar x_2$ is the sample mean of $x_{t2}$, 
and $\stackrel{p}{\longrightarrow}$ denotes convergence in probability. 
Condition 1 is mild. It is fulfilled when $\{\varepsilon_t\}$ is weakly stationary with $\det(\mathrm{Var}(\varepsilon_t)) \ne 0$, 
$E\|\varepsilon_t\|^{2\gamma} < C$ for some constants $\gamma > 1$ and $C < \infty$, and $\{\varepsilon_t\}$ is also $\alpha$-mixing with mixing 
coefficients $\alpha_m$ satisfying the condition $\sum_{m=1}^{\infty}$ [...]; see Theorem 3.2.3 of Lin and Lu 
(1997). It is also fulfilled when $\varepsilon_t =$ [...], where [...] are i.i.d. with non-singular covariance 
matrix and [...] $< \infty$ for some constant $\gamma > 1$, and $\det(\sum_{j=0}^{\infty} C_j) \ne 0$, [...] $\|C_j\| < \infty$. See 
Fakhre-Zakeria and Lee (2000). 

Theorem 1. Let $r$ be known. Under Condition 1, $D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2)) = o_p(1)$. Furthermore, 

(i) $D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2)) =$ [...] provided either (a) $|I_0| \ge 2$ or (b) $|I_0| = 1$ and $E z_t^{i_0} = 0$, and 

(ii) $D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2)) =$ [...] provided $|I_0| = 1$ and $E z_t^{i_0} \ne 0$, 

where $d_{\min} = \min_{1 \le j \le p-r} d_j$, $I_0 = \{i : x_t^i \sim I(d_{\min}), \ 1 \le i \le p - r\}$ and $|I_0|$ denotes the number 
of elements in $I_0$. 

Theorem 2. Let $1 \le r < p$ and Condition 1 hold. 

(i) For $\hat r$ defined in (2.6), $\lim_{n \to \infty} P(\hat r = r) = 1$. 

(ii) For $\tilde r$ defined in (2.7), $\lim_{n \to \infty} P(\tilde r = r) = 1$ provided $1/\omega_n + \omega_n/n^{[\ldots]} = o_p(1)$. 


3.2 When $n \to \infty$ and $p \to \infty$, $p = o(n^c)$ 

We extend the asymptotic results in the previous section to the cases when $p \to \infty$ and $p = o(n^c)$ 
for some $c \in (0, 1/2)$. Technically we employ a normal approximation method to establish the 
results. See Condition 2(i) below. 

Condition 2. 

(i) Suppose that the components of $z_t$ are independent and $E z_t = 0$. For each component $\{z_t^i\}$ 
of $z_t$, there exists an independent and standard normal sequence $\{\nu_t^i\}$ for which as $n \to \infty$, 

$$\sup_{1 \le i \le p-r} \ \sup_{0 \le t \le 1} E \Big| \sum_{s=1}^{[nt]} [\ldots] \Big| \ [\ldots], \qquad (3.2)$$

where $0 < \tau < 1/2$ is a constant, $b_1 \le \sigma_{ii}^2 = \lim_{n \to \infty} \mathrm{Var}(\sum_{s=1}^{n} z_s^i)/n \le b_2$ for any $i$, and 
$b_1, b_2$ are two positive constants. 

(ii) The sample autocovariance matrix of $x_{t2}$ satisfies 

$$\sup_{0 \le j \le j_0} \Big\| \frac{1}{n}\sum_{t=1}^{n-j} (x_{t+j,2} - \bar x_2)(x_{t2} - \bar x_2)' - \mathrm{Cov}(x_{1+j,2}, x_{1,2}) \Big\|_2 \stackrel{p}{\longrightarrow} 0.$$

(iii) Suppose $\{z_t\}$ and $\{x_{t2}\}$ are independent and for $\tau$ given above 

$$\sup_{p-r < j \le p} \ \sum_{s,t=1}^{n} |E(\varepsilon_s^j \varepsilon_t^j)| = [\ldots].$$

Remark 1. The inequalities in the line below (3.2) will hold if the $z_t$'s are $I(0)$ with spectral 
density continuous at zero frequency, because the variance is proportional to the Cesàro sum of 
the Fourier series of the spectral density at zero frequency, and thus converges to the latter (which 
is positive and finite under $I(0)$) after normalization. 
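
The Cesàro-sum argument can be written out for a single scalar component. The display below is a standard identity, stated under assumed notation that is not from the paper: $\gamma_i(k)$ denotes the lag-$k$ autocovariance of $z_t^i$ and $f_i(\lambda) = (2\pi)^{-1}\sum_k \gamma_i(k) e^{-\mathrm{i}k\lambda}$ its spectral density. Under a different normalisation of the spectral density only the constant $2\pi$ changes.

```latex
\frac{1}{n}\operatorname{Var}\Big(\sum_{s=1}^{n} z_s^i\Big)
  \;=\; \sum_{|k|<n}\Big(1-\frac{|k|}{n}\Big)\gamma_i(k)
  \;\longrightarrow\; 2\pi f_i(0) \qquad (n \to \infty).
```

The middle expression is $2\pi$ times the $n$-th Fejér (Cesàro) mean at frequency zero of the Fourier series of $f_i$, which converges to $f_i(0)$ whenever $f_i$ is continuous there. Since $0 < f_i(0) < \infty$ under $I(0)$, the limit is positive and finite for each $i$.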

Remark 2. When integration orders of all nonstationary components are the same, the independence 
assumption in Condition 2(i) can be relaxed and replaced by $z_t = Be_t$, where $B$ is a $p \times m$ 
constant matrix, $m \ge p - r$, all the components of $e_t = (e_t^1, \ldots, e_t^m)'$ are independent, and $\{e_t^i\}$ 
satisfies (3.2) for $1 \le i \le m$. 

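Remark 3. Let $p =$ [...]. Condition 2 is implied by any of the three assertions below. 

(i) The components of $\varepsilon_t$ are independent of each other, and each component series $\{\varepsilon_t^i\}$ is a 
martingale difference sequence with $\sup_{1 \le i \le p} E|\varepsilon_t^i|^q < \infty$ for some $q > 2$. Furthermore, for 
some $2 < q^* \le \min\{4, q\}$, 

$$\sup_{1 \le i \le p} E \Big| \sum_{t=1}^{n} [\ldots] \Big| \ [\ldots].$$

(ii) The components of $\varepsilon_t$ are independent, $E\varepsilon_t = 0$, and $\max_{1 \le i \le p} E|\varepsilon_t^i|^{\kappa} < \infty$ for some $\kappa > q \in (2, 4]$. 
The process $\{\varepsilon_t\}$ is $\alpha$-mixing with mixing coefficients $\alpha_m$ satisfying 

$$\sum_{m=1}^{\infty} [\ldots] < \infty. \qquad (3.3)$$

(iii) The components of $\varepsilon_t$ are independent. Each component $\varepsilon_t^i$ satisfies the following conditions. 

(a) There exists an i.i.d. random sequence $\{\eta_t^i\}$ such that 

$$\varepsilon_t^i = \sum_{j=0}^{\infty} c_{ij}\, \eta_{t-j}^i.$$

(b) $E\varepsilon_t^i = 0$, $E|\varepsilon_t^i|^q < \infty$ for some $q > 2$ and $\sum_{j=0}^{\infty} j|c_{ij}| < \infty$. 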


Theorem 3. Let $r$ be known and Condition 2 hold. If $p = o(n^{1/2-\tau})$ with $\tau$ given in Condition 
2, it holds that 

$$D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2)) = [\ldots],$$

where $\lambda^*$ is the smallest eigenvalue of $\int_0^1 F(t)F'(t)\,dt$ defined in Lemma 9 in Section [...] below. 

Remark 4. Theorem 3 is derived under the condition $p = o(n^{1/2-\tau})$, while there are no direct 
constraints on either $r$ or $p - r$. However when $p - r$ is fixed, $\int_0^1 F(t)F'(t)\,dt$ is a $(p - r) \times (p - r)$ 
positive definite matrix, and, hence, $\lambda^*$ is positive and $O_e(1)$. When the integration orders of all 
the nonstationary components are the same and equal to $d_{\min}$, then $(\lambda^*)^{-1} = O_p((p - r)^{[\ldots]})$. 

Theorem 4. Let Condition 2 hold and $p =$ [...]. Then $\tilde r$, defined in (2.7), converges to $r$ 
in probability, provided $\lim_{n \to \infty} P\{[\ldots] \log n < \omega_n < [\ldots]/\log n\} = 1$. 

4 Numerical properties 


We illustrate the proposed method with both simulated and real data examples below. Note that 
the comparison with Johansen's (1991) likelihood method is carried out for Example 1 only, as 
Examples 2 & 3 consider the settings with $d_1, \ldots, d_m > 1$ for which his method is not applicable. 

Example 1. Let in model (2.1) all components of $x_{t2}$ be stationary AR(1) and all components 
of $x_{t1}$ be ARIMA(1,1,1) processes. The AR(1) coefficients are generated independently from 
$U(-0.8, 0.8)$, while the AR and MA coefficients in the ARIMA(1,1,1) are generated independently 
from $U(0.3, 0.8)$ and $U(0, 0.95)$. The innovations in these processes are independent $N(0, 1)$. The 
elements of $A$ are generated independently from $U(-3, 3)$. We consider various combinations for 
$p$, with $r = p/4$ and $r = 3$, and sample size $n$ between 500 and 2500; see Table 1. For each setting, 
we generate 500 replicates. We estimate the cointegration rank $r$ using both the ratio method (2.6) 
and the information criterion (2.7). Since the estimated cointegration rank is not necessarily 
equal to $r$, and $A$ is not a half orthogonal matrix (as specified above), we extend the definition 
of discrepancy measure (3.1) as follows: 

$$D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2)) = \Big\{ 1 - \frac{1}{\max(r^*, r)}\,\mathrm{tr}\big(\hat A_2 \hat A_2' B_2 (B_2'B_2)^{-1} B_2'\big) \Big\}^{1/2}, \qquad (4.1)$$

where $r^* = \mathrm{rank}(\hat A_2)$, and $B_2$ is the $p \times r$ matrix consisting of the last $r$ columns of $(A^{-1})'$, as 
now $x_{t2} = B_2' y_t$. Then $D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2)) \in [0, 1]$, being 1 if and only if $\mathcal{M}(\hat A_2)$ and $\mathcal{M}(B_2)$ 
are orthogonal with each other, and 0 if and only if the two subspaces are the same. When $r^* = r$ 
and $A'A = I_p$, $B_2 = A_2$ and $D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2)) = D(\mathcal{M}(\hat A_2), \mathcal{M}(A_2))$ defined in (3.1). We 
use $j_0 = 5$ in the definition of $W$ in (2.3). 
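
The extended discrepancy (4.1) can be sketched as follows, assuming the estimated matrix has orthonormal columns (as eigenvectors do) while $B_2$ may be any full-column-rank matrix. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def subspace_distance_general(A2_hat, B2):
    """D_1 of (4.1): A2_hat has orthonormal columns, B2 has full column rank."""
    r_star = np.linalg.matrix_rank(A2_hat)
    r = B2.shape[1]
    proj_B = B2 @ np.linalg.solve(B2.T @ B2, B2.T)       # projector onto M(B2)
    inner = np.trace(A2_hat @ A2_hat.T @ proj_B) / max(r_star, r)
    return float(np.sqrt(max(1.0 - inner, 0.0)))         # clip rounding error

I4 = np.eye(4)
B2 = 2.0 * I4[:, :2]                                     # non-orthonormal basis of a 2-dim space
d_equal = subspace_distance_general(I4[:, :2], B2)       # same space
d_orth = subspace_distance_general(I4[:, 2:], B2)        # orthogonal space
d_under = subspace_distance_general(I4[:, :1], B2)       # rank underestimated: r* = 1 < r = 2
```

Dividing by max(r*, r) is what penalises a wrong rank: a one-dimensional estimate lying inside the true two-dimensional space still has distance sqrt(1/2) rather than 0.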

We compare the performance of our procedure with Johansen's (1991) trace test with 
significance level $\alpha = 0.05$. Since the limiting distribution, i.e. the distribution of 
$\mathrm{tr}\{\int_0^1 (dW)F' [\int_0^1 FF'\,du]^{-1} \int_0^1 F(dW)'\}$, in his test is nonstandard, we approximate it by the distribution of 

$$\Big[\sum_{t=1}^{T} \varepsilon_t (X_{t-1} - \bar X_{-1})'\Big] \Big[\sum_{t=1}^{T} (X_{t-1} - \bar X_{-1})(X_{t-1} - \bar X_{-1})'\Big]^{-1} \Big[\sum_{t=1}^{T} (X_{t-1} - \bar X_{-1})\varepsilon_t'\Big], \qquad (4.2)$$

where $\varepsilon_t = (\varepsilon_{t,1}, \ldots, \varepsilon_{t,p-r})'$, $X_0 = 0$ and $X_t = \sum_{i=1}^{t} \varepsilon_i$, and $\{\varepsilon_{t,i}\}$ are independent $N(0, 1)$ 
variables. Setting $T = 1000$, the critical values were calculated in a simulation with 6000 repetitions 
of the trace of (4.2) for $p > 5$. This is the procedure used in Johansen and Juselius (1990) for 
calculating the critical values with $p \le 5$. 
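
The simulation of critical values can be sketched as below. This is an illustrative reading of (4.2), not the authors' code: the function names are hypothetical, `m` stands for the dimension $p - r$ of the simulated random walk, and the sample mean of the lagged walk is taken over $t = 1, \ldots, T$.

```python
import numpy as np

def trace_stat(m, T, rng):
    """One draw of the trace of the matrix in (4.2) for an m-dimensional walk."""
    eps = rng.standard_normal((T, m))                 # eps_1, ..., eps_T
    X = np.cumsum(eps, axis=0)                        # X_1, ..., X_T
    X_lag = np.vstack([np.zeros((1, m)), X[:-1]])     # X_0 = 0, X_1, ..., X_{T-1}
    Xc = X_lag - X_lag.mean(axis=0)                   # X_{t-1} minus its sample mean
    S01 = eps.T @ Xc                                  # sum_t eps_t (X_{t-1} - Xbar)'
    S11 = Xc.T @ Xc                                   # sum_t (X_{t-1} - Xbar)(...)'
    return float(np.trace(S01 @ np.linalg.solve(S11, S01.T)))

def critical_value(m, T=1000, reps=6000, level=0.95, seed=0):
    """Monte Carlo quantile of the trace statistic at the given level."""
    rng = np.random.default_rng(seed)
    stats = np.array([trace_stat(m, T, rng) for _ in range(reps)])
    return float(np.quantile(stats, level))

# A light-weight call for illustration; the text uses T = 1000 and 6000 repetitions.
cv = critical_value(m=2, T=200, reps=500)
```

The defaults mirror the settings quoted in the text, while the example call uses smaller values only to keep the run short.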

Table 1 reports the relative frequencies (Freq) of the events $\{\hat r = r\}$ and $\{\tilde r = r\}$, and the 
average of the distance (Dist) $D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2))$ (see (4.1)) for $r = p/4$ in a simulation with 500 
replications, where the penalty $\omega_n$ in the information criterion IC($l$) is taken as either $\omega_n^1 = n^{[\ldots]}\hat\lambda_p$ 
or $\omega_n^2 =$ [...], and $\hat\lambda_p$ is the smallest eigenvalue of $W$. Also included in Table 1 are the results 
obtained by applying the Johansen likelihood test to the transformed component series. From 
Table 1 we see that our procedure always has higher relative frequencies and smaller distances, 
which indicates that our procedure outperforms Johansen's likelihood method when $r$ is relatively 
small. Similar patterns are observed in Table 2 with $r = 3$. 


Table 1: Relative frequencies for $\hat r = r$ with $r = p/4$, and average distance, in simulation with 500 replications 
in Example 1. 

                           n = 500        n = 1000       n = 1500       n = 2000       n = 2500
 p    Method             Freq   Dist    Freq   Dist    Freq   Dist    Freq   Dist    Freq   Dist
 8    Johansen           0.390  0.371   0.452  0.326   0.490  0.302   0.480  0.307   0.514  0.289
      ratio              0.748  0.174   0.848  0.105   0.884  0.081   0.886  0.079   0.890  0.074
      IC($\omega_n^1$)   0.654  0.217   0.780  0.136   0.802  0.123   0.818  0.112   0.852  0.091
      IC($\omega_n^2$)   0.448  0.338   0.572  0.136   0.628  0.222   0.656  0.206   0.690  0.185
 12   Johansen           0.210  0.449   0.344  0.355   0.380  0.336   0.400  0.322   0.464  0.287
      ratio              0.658  0.236   0.794  0.138   0.770  0.151   0.844  0.102   0.840  0.107
      IC($\omega_n^1$)   0.556  0.261   0.708  0.168   0.748  0.145   0.796  0.114   0.824  0.101
      IC($\omega_n^2$)   0.366  0.358   0.444  0.299   0.518  0.258   0.536  0.247   0.610  0.206
 20   Johansen           0.008  0.604   0.050  0.503   0.080  0.456   0.134  0.425   0.164  0.406
      ratio              0.404  0.390   0.544  0.299   0.620  0.243   0.704  0.184   0.730  0.169
      IC($\omega_n^1$)   0.390  0.342   0.554  0.245   0.670  0.183   0.686  0.154   0.768  0.121
      IC($\omega_n^2$)   0.232  0.417   0.346  0.331   0.400  0.294   0.456  0.256   0.472  0.245
 28   Johansen           0      0.696   0      0.595   0.002  0.549   0.004  0.522   0.010  0.501
      ratio              0.234  0.489   0.386  0.372   0.462  0.332   0.558  0.280   0.582  0.250
      IC($\omega_n^1$)   0.252  0.407   0.454  0.274   0.546  0.228   0.610  0.195   0.700  0.144
      IC($\omega_n^2$)   0.176  0.442   0.270  0.350   0.334  0.304   0.358  0.284   0.436  0.240


Example 2. Let in model (2.1) all components of $x_{t2}$ be stationary AR(1) and all components 
of $x_{t1}$ be ARIMA(1,2,1) processes. The AR(1) coefficients are generated independently from 
$U(-0.8, 0.8)$, the AR and MA coefficients in those ARIMA(1,2,1) are generated independently 
from $U(0.3, 0.8)$ and $U(0, 0.95)$. The innovations in these processes are independent $N(0, 1)$. The 
elements of $A$ are generated independently from $U(-3, 3)$. We consider various combinations for 
$p$ and $r$, and sample size $n$ between 300 and 2500; see Table 3. For each setting, we replicate the 
simulation 1000 times. 
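
One replicate of this design could be generated along the following lines. This is a sketch under assumptions that the text does not pin down: the MA part is taken with the sign convention $u_t = \phi u_{t-1} + e_t + \theta e_{t-1}$, the ARMA recursions start from zero rather than from the stationary distribution, and the function names are hypothetical.

```python
import numpy as np

def arma11(n, phi, theta, rng):
    """ARMA(1,1) path u_t = phi * u_{t-1} + e_t + theta * e_{t-1}, started at u_0 = 0."""
    e = rng.standard_normal(n + 1)
    u = np.empty(n)
    prev = 0.0
    for t in range(n):
        prev = phi * prev + e[t + 1] + theta * e[t]
        u[t] = prev
    return u

def simulate_example2(n, p, r, rng):
    """Return (y, x, A) with y_t = A x_t, x_{t1} ARIMA(1,2,1) and x_{t2} AR(1)."""
    cols = []
    for _ in range(p - r):
        u = arma11(n, rng.uniform(0.3, 0.8), rng.uniform(0.0, 0.95), rng)
        cols.append(np.cumsum(np.cumsum(u)))          # integrate twice: d = 2
    for _ in range(r):
        cols.append(arma11(n, rng.uniform(-0.8, 0.8), 0.0, rng))   # theta = 0 gives AR(1)
    x = np.column_stack(cols)
    A = rng.uniform(-3.0, 3.0, size=(p, p))
    return x @ A.T, x, A                              # rows of the first array are y_t'

rng = np.random.default_rng(3)
y, x, A = simulate_example2(n=300, p=6, r=2, rng=rng)
```

Replacing the double cumulative sum by a single one gives the ARIMA(1,1,1) components of Example 1 under the same assumptions.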

Fig. 1 presents the sample ACF/CCF of $y_t$ for one instance with $n = 1000$, $p = 6$ and $r = 2$. 
Table 2: Relative frequencies for $\hat r = r$ with $r = 3$, and average distance, in simulation with 500 replications in 
Example 1. 

                           n = 500        n = 1000       n = 1500       n = 2000       n = 2500
 p    Method             Freq   Dist    Freq   Dist    Freq   Dist    Freq   Dist    Freq   Dist
 5    Johansen           0.834  0.086   0.838  0.086   0.844  0.081   0.856  0.074   0.842  0.080
      ratio              0.620  0.252   0.704  0.197   0.734  0.176   0.802  0.131   0.816  0.122
      IC($\omega_n^1$)   0.802  0.130   0.868  0.087   0.884  0.077   0.916  0.058   0.922  0.050
      IC($\omega_n^2$)   0.902  0.065   0.922  0.051   0.940  0.038   0.958  0.0284  0.962  0.025
 10   Johansen           0.522  0.277   0.562  0.247   0.540  0.255   0.606  0.217   0.550  0.248
      ratio              0.636  0.242   0.788  0.145   0.790  0.144   0.860  0.095   0.874  0.084
      IC($\omega_n^1$)   0.654  0.210   0.812  0.118   0.856  0.089   0.882  0.072   0.896  0.062
      IC($\omega_n^2$)   0.504  0.275   0.628  0.201   0.672  0.175   0.746  0.134   0.740  0.137
 15   Johansen           0.078  0.594   0.184  0.485   0.276  0.423   0.320  0.392   0.346  0.371
      ratio              0.526  0.311   0.692  0.192   0.742  0.163   0.808  0.119   0.810  0.116
      IC($\omega_n^1$)   0.356  0.381   0.494  0.285   0.578  0.235   0.644  0.196   0.684  0.170
      IC($\omega_n^2$)   0.174  0.498   0.262  0.428   0.316  0.390   0.336  0.372   0.388  0.337
 20   Johansen           0      0.749   0.018  0.656   0.022  0.618   0.053  0.579   0.060  0.572
      ratio              0.316  0.443   0.464  0.315   0.552  0.268   0.606  0.233   0.645  0.207
      IC($\omega_n^1$)   0.176  0.527   0.262  0.440   0.316  0.396   0.361  0.362   0.396  0.337
      IC($\omega_n^2$)   0.100  0.617   0.114  0.577   0.140  0.544   0.176  0.516   0.178  0.509
 25   Johansen           0      0.847   0      0.766   0.002  0.713   0      0.695   0.013  0.672
      ratio              0.242  0.497   0.315  0.423   0.374  0.371   0.413  0.343   0.530  0.275
      IC($\omega_n^1$)   0.132  0.596   0.166  0.527   0.170  0.518   0.213  0.472   0.246  0.461
      IC($\omega_n^2$)   0.064  0.690   0.071  0.660   0.078  0.644   0.096  0.612   0.103  0.603
 30   Johansen           0      0.905   0      0.831   0      0.796   0      0.769   0.003  0.747
      ratio              0.186  0.559   0.270  0.456   0.295  0.423   0.313  0.405   0.363  0.368
      IC($\omega_n^1$)   0.103  0.656   0.106  0.616   0.105  0.593   0.146  0.551   0.140  0.548
      IC($\omega_n^2$)   0.056  0.744   0.030  0.730   0.060  0.698   0.063  0.683   0.045  0.693


By applying our proposed method to this sample, the sample ACF/CCF of the transformed 
$\hat x_t = \hat A' y_t$ are plotted in Fig. 2. Those figures show clearly that all the components of $y_t$ are 
nonstationary while the last two components of $\hat x_t$ are stationary. 

(Put Figures 1-2 about here.) 

Table 3 reports the relative frequencies of the events $\{\hat r = r\}$ and $\{\tilde r = r\}$ in a simulation 
with 1000 replications, where the penalty $\omega_n$ in the information criterion IC($l$) is taken as either 
$\omega_n^1 =$ [...] or $\omega_n^2 =$ [...], and $\hat\lambda_p$ is the smallest eigenvalue of $W$. Also included in Table 3 
are the results obtained by applying the Phillips-Perron unit-root test to the transformed 
component series. We choose the Phillips-Perron method among other unit-root tests as it is 
applicable with different integration orders. Note that when $\omega_n = n\hat\lambda_p$, $\tilde r = \hat r$. While the numerical 
results in Table 3 lend further evidence for the consistency of both estimators, finite sample 
performance depends on the choice of the penalty parameter $\omega_n$; a large $\omega_n$ should be used when $r$ 
is relatively large. But the unit-root tests lead to very accurate estimates for the cointegration 
ranks for this example. 

The boxplots of $D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2))$ are presented in Figs. 3-5 for $(p, r) = (6, 2)$, $(10, 4)$ and 
Table 3: Relative frequencies for $\hat r = r$ and $\tilde r = r$ in simulation with 1000 replications in Example 2. 

 (p, r)         Method             n = 300   500     1000    1500    2000    2500
 p=6,  r=2      Ratio              0.835     0.887   0.979   0.993   1.000   0.999
                IC($\omega_n^1$)   0.923     0.953   0.994   1.000   1.000   1.000
                IC($\omega_n^2$)   0.967     0.985   0.998   0.998   1.000   0.998
                Unit-root test     0.999     1.000   0.998   0.989   0.995   0.992
 p=6,  r=4      Ratio              0.278     0.343   0.715   0.921   0.970   0.988
                IC($\omega_n^1$)   0.543     0.644   0.920   0.987   0.994   0.998
                IC($\omega_n^2$)   0.762     0.852   0.986   0.999   1.000   0.988
                Unit-root test     1.000     1.000   1.000   0.995   0.998   0.993
 p=10, r=2      Ratio              0.799     0.906   0.993   0.992   0.995   0.991
                IC($\omega_n^1$)   0.822     0.95    0.988   0.984   0.978   1.000
                IC($\omega_n^2$)   0.736     0.904   0.953   0.922   0.928   1.000
                Unit-root test     0.978     0.991   0.997   0.991   0.990   0.986
 p=10, r=4      Ratio              0.333     0.459   0.880   0.978   0.988   0.997
                IC($\omega_n^1$)   0.594     0.774   0.982   0.998   0.999   0.999
                IC($\omega_n^2$)   0.802     0.937   0.996   0.998   0.999   0.999
                Unit-root test     0.999     1.000   0.998   0.989   0.995   0.992
 p=20, r=6      Ratio              0.994     0.998   0.996   0.996   0.994   0.991
                IC($\omega_n^1$)   0.075     0.565   0.948   0.940   0.896   0.882
                IC($\omega_n^2$)   0.330     0.791   0.798   0.691   0.616   0.558
                Unit-root test     0.815     0.289   0.503   0.805   0.901   0.858
 p=20, r=10     Ratio              0.000     0.005   0.479   0.873   0.946   0.951
                IC($\omega_n^1$)   0.000     0.050   0.857   0.974   0.991   0.993
                IC($\omega_n^2$)   0.003     0.410   0.972   0.996   0.995   0.999
                Unit-root test     0.994     0.961   0.976   0.999   0.994   0.989
 p=20, r=14     Ratio              0.000     0.000   0.026   0.356   0.753   0.874
                IC($\omega_n^1$)   0.000     0.000   0.254   0.791   0.949   0.983
                IC($\omega_n^2$)   0.000     0.015   0.717   0.958   0.993   0.996
                Unit-root test     0.987     1.000   0.998   0.999   0.993   0.996


(20,14) respectively. For all the settings reported, $D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2))$ decreases as the sample 
size $n$ increases. 

(Put Figures 3-5 about here.) 

To illustrate the impact of $j_0$ used in defining $W$ in (2.3), we ran the simulation with $n = 500$ 
and $j_0$ taking 7 different values between 5 and 100. Each setting is repeated 500 times. The 
results are reported in Table 4. The different values of $j_0$ lead to about the same performance in 
terms of the relative frequencies for $\tilde r = r$ and the means and the standard deviations for distance 
$D_1(\mathcal{M}(\hat A_2), \mathcal{M}(B_2))$. For example, when $p = 10$ and $r = 2$, the estimation for $r$ improves slightly 
when $j_0$ increases, while the estimation for the cointegration space becomes slightly worse. Overall 
Table 4 suggests that the proposed method may be insensitive to the choice of $j_0$ in (2.3). 

Example 3. Now we consider an example in which some components of $x_{t1}$ are $I(1)$ and some 
are $I(2)$. More precisely in model (2.1) all components of $x_{t2}$ are stationary AR(1), $s$ components 
of $x_{t1}$ are ARIMA(1,1,1), and the other $p - r - s$ components are ARIMA(0,2,1). The AR(1) 
coefficients are taken as $-0.8 + 1.6i/r$, $i = 1, 2, \ldots, r$. The ARIMA(1,1,1) coefficients are taken 
as $0.3 +$ [...]$i/s$ and $0.2 +$ [...]$i/s$, $i = 1, 2, \ldots, s$ respectively. The ARIMA(0,2,1) coefficients are 
Table 4: Relative frequencies for $\hat{r} = r$, and means and standard deviations (STD) of
$D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))$, in simulation with 500 replications in Example 2.

(p, r)       $j_0$                  5       10      20      30      40      50      100
p=6, r=2     Relative frequency   0.984   0.984   0.988   0.988   0.982   0.986   0.986
             $D_1$ mean           0.010   0.012   0.009   0.007   0.013   0.010   0.010
             $D_1$ STD            0.083   0.088   0.077   0.070   0.094   0.083   0.083
p=6, r=4     Relative frequency   0.862   0.862   0.870   0.874   0.878   0.866   0.878
             $D_1$ mean           0.074   0.078   0.075   0.073   0.068   0.077   0.070
             $D_1$ STD            0.187   0.201   0.199   0.197   0.187   0.199   0.192
p=10, r=2    Relative frequency   0.882   0.884   0.916   0.900   0.934   0.936   0.942
             $D_1$ mean           0.008   0.006   0.006   0.009   0.007   0.008   0.009
             $D_1$ STD            0.055   0.045   0.045   0.063   0.055   0.062   0.070

generated independently from U(-0.95, 0.95). The innovations in these processes are independent
N(0,1). Let the elements of A be generated independently from U(-3, 3). We consider various
combinations of p, r, s, and the sample size n. For each setting, we replicate the simulation 1000
times and estimate the cointegration rank r using both the ratio method (2.6) and the information
criterion (HID with Un equal to either = n^/^Xp or = vP'/^Xp. 
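
The mixed-order design described above can be mimicked in a few lines. The following is only a simplified sketch, not the simulation code behind Table 5: it drives the I(1) and I(2) components by white noise rather than by the ARMA inputs of Example 3, and the function name `simulate_mixed_orders`, its arguments and the single AR coefficient `ar_coef` are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_mixed_orders(n, n_i2, n_i1, n_i0, ar_coef=0.5, seed=0):
    """Simulate y_t = A x_t with I(2), I(1) and stationary AR(1) components.

    Simplification (assumption): the nonstationary components are plain
    cumulative sums of N(0,1) noise, not the ARIMA inputs of Example 3.
    """
    rng = np.random.default_rng(seed)
    p = n_i2 + n_i1 + n_i0
    e = rng.standard_normal((n, p))          # independent N(0,1) innovations
    x = np.empty((n, p))
    # I(2) components: innovations summed twice
    x[:, :n_i2] = np.cumsum(np.cumsum(e[:, :n_i2], axis=0), axis=0)
    # I(1) components: innovations summed once
    x[:, n_i2:n_i2 + n_i1] = np.cumsum(e[:, n_i2:n_i2 + n_i1], axis=0)
    # I(0) components: stationary AR(1) recursions (these play the role of x_{t2})
    for k in range(n_i2 + n_i1, p):
        x[0, k] = e[0, k]
        for t in range(1, n):
            x[t, k] = ar_coef * x[t - 1, k] + e[t, k]
    A = rng.uniform(-3.0, 3.0, size=(p, p))  # mixing matrix with U(-3, 3) entries
    y = x @ A.T                              # row t holds y_t = A x_t
    return y, x, A, e
```

For instance, `simulate_mixed_orders(1000, 2, 2, 2)` gives a six-dimensional series with two components of each order; every observed coordinate of `y` is then dominated by the I(2) components.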

Fig 6 plots the sample ACF/CCF of $y_t$ for one instance with n = 1000, p = 6, r = 2 and
s = 2. The sample ACF/CCF of the transformed $\hat{x}_t = \hat{A}'y_t$ are plotted in Fig 7. Those figures
show clearly that all the components of $y_t$ are nonstationary while the last two components of $\hat{x}_t$
are stationary.

(Put Figures 6-7 about here.)

Table [5] reports the relative frequencies of the events {r = r} and {r = r} in a simulation 
with 1000 replications, where the penalty ojn in the information criterion IC{1) is taken as either 
or = n^/^Xp, and Xp is the smallest eigenvalue of W. The estimates for the 
cointegration ranks by the Phillips-Perron test are more accurate. Compared with Table 3, the
estimation of the cointegration rank r is less accurate than that for Example 1. This is due to
the different integration orders of the components of $y_t$, which imply that
the eigenvalues of $\hat{W}$ are more diverse; see (2.2). However, the estimation of the cointegrated
variables themselves is hardly affected. We plot the boxplots of the distances between the true
cointegration space $\mathcal{M}(B_2)$ and its estimator $\mathcal{M}(\hat{A}_2)$, defined as in (4.1), in Figs 8-9 for the
four different settings of (p, r, s), where the cointegration rank r is either estimated by the ratio
method or simply set at its true value. The distances for the estimation with the true r are significantly
smaller than those for the estimation with the estimated r. These results indicate clearly that, while
the performance of the estimators of the cointegration rank r is not entirely satisfactory when
the components of $y_t$ have different integration orders, the transformed series $\hat{x}_t = \hat{A}'y_t$ contain
the well estimated cointegrated variables.
Table 5: Relative frequencies for $\tilde{r} = r$ and $\hat{r} = r$ in simulation with 1000 replications in Example 3.

(p, r, s)     n                    300     500     1000    1500    2000    2500
(6, 2, 2)     Ratio               0.711   0.778   0.873   0.918   0.909   0.873
              IC($\omega_n^1$)    0.788   0.841   0.877   0.857   0.783   0.687
              IC($\omega_n^2$)    0.476   0.522   0.623   0.749   0.846   0.893
              Unit-root           0.968   0.969   0.961   0.951   0.959   0.947
(6, 4, 1)     Ratio               0.186   0.205   0.290   0.535   0.698   0.822
              IC($\omega_n^1$)    0.406   0.430   0.600   0.836   0.925   0.962
              IC($\omega_n^2$)    0.020   0.035   0.039   0.114   0.235   0.346
              Unit-root           1.000   1.000   0.997   0.999   0.997   0.998
(10, 4, 1)    Ratio               0.056   0.084   0.384   0.786   0.936   0.968
              IC($\omega_n^1$)    0.232   0.306   0.752   0.940   0.964   0.937
              IC($\omega_n^2$)    0.002   0.007   0.049   0.303   0.607   0.801
              Unit-root           0.961   0.978   0.970   0.946   0.907   0.904
(10, 6, 2)    Ratio               0.018   0.035   0.105   0.372   0.637   0.801
              IC($\omega_n^1$)    0.096   0.122   0.448   0.744   0.872   0.914
              IC($\omega_n^2$)    0.000   0.000   0.002   0.046   0.192   0.369
              Unit-root           0.986   0.984   0.986   0.960   0.954   0.952
(20, 10, 1)   Ratio               0.000   0.002   0.043   0.493   0.802   0.925
              IC($\omega_n^1$)    0.001   0.003   0.354   0.831   0.909   0.898
              IC($\omega_n^2$)    0.000   0.000   0.000   0.033   0.250   0.522
              Unit-root           0.577   0.720   0.772   0.734   0.654   0.633
(20, 14, 2)   Ratio               0.000   0.000   0.000   0.060   0.295   0.560
              IC($\omega_n^1$)    0.000   0.000   0.021   0.409   0.698   0.845
              IC($\omega_n^2$)    0.000   0.000   0.000   0.001   0.008   0.046
              Unit-root           0.962   0.939   0.879   0.873   0.854   0.827

Example 4. For an empirical example, we consider the 8 monthly US Industrial Production
indices in January 1947 - December 1993 published by the US Federal Reserve, namely the total
index, manufacturing index, durable manufacturing, nondurable manufacturing, mining, utilities,
products and materials. The original 8 time series are plotted in Fig 10. Applying the proposed
method to these data, the transformed series $\hat{x}_t = \hat{A}'y_t$ are plotted in Fig 11 together with their
sample ACF. The ratio method (2.6) indicates $\tilde{r} = 3$ cointegrated variables. The IC method (2.7)
leads to $\hat{r} = 3$ under one of the two penalties $\omega_n$ and $\hat{r} = 4$ under the other, where the penalties
are defined in terms of $\hat{\lambda}_8$, the minimum eigenvalue of $\hat{W}$ defined as in (2.3). Indeed, the last
3 or 4 series in Fig 11 certainly look stationary.

We also apply Johansen's (1991) likelihood method to this data set. Both the trace and the
maximum eigenvalue tests indicate r = 4. The corresponding transformed series together with their sample
ACF are plotted in Fig 12.

Let $\hat{A}_2$ denote the last 4 columns of $\hat{A}$, and let $B_2$ consist of the loadings for the last 4 component
series displayed in Fig 12; i.e., the columns of $\hat{A}_2$ are the loadings of the 4 cointegrated variables
identified by the proposed method in this paper, and the columns of $B_2$ are the loadings of the 4
cointegrated variables identified by Johansen's likelihood method. Then

$$D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))^2 = 1 - \frac{1}{4}\,\mathrm{tr}\{\hat{A}_2\hat{A}_2' B_2 (B_2'B_2)^{-1} B_2'\} = 1 - 0.9816 = 0.0184.$$

This indicates that the two sets of cointegrated variables identified by the two methods are
effectively equivalent.
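
The distance used in this comparison depends only on the two column spaces, so it can be computed from orthonormal bases of each. The sketch below is illustrative only: the function name `subspace_distance` is hypothetical, and both inputs are orthonormalised by a QR decomposition, which reproduces the projection $B_2(B_2'B_2)^{-1}B_2'$ whenever $B_2$ has full column rank.

```python
import numpy as np

def subspace_distance(A2, B2):
    """sqrt(1 - tr(P_A P_B) / r) for the column spaces of two p x r matrices."""
    A2 = np.asarray(A2, dtype=float)
    B2 = np.asarray(B2, dtype=float)
    r = A2.shape[1]
    Qa, _ = np.linalg.qr(A2)          # orthonormal basis of span(A2)
    Qb, _ = np.linalg.qr(B2)          # orthonormal basis of span(B2)
    # tr(P_A P_B) with P_A = Qa Qa' and P_B = Qb Qb'
    overlap = np.trace(Qa @ Qa.T @ Qb @ Qb.T) / r
    return float(np.sqrt(max(0.0, 1.0 - overlap)))
```

The value is 0 when the two matrices span the same space, whatever bases are used, and 1 when the spaces are orthogonal, which is why a squared distance of 0.0184 is read as near equivalence.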


5 Fractional cointegration 


Fractional cointegration has attracted increasing attention in recent years; see, e.g., Robinson
and Hualde (2003), Chen and Hurvich (2006) and Robinson (2008). In this section, we generalize
the method presented in Section 2 to the case when the components of $y_t$ may be fractionally
integrated. For simplicity, we now assume p is fixed.

We first give a brief introduction to fractionally integrated processes and the concept of
fractional cointegration.

Let $v_t^{\#} = v_t 1(t > 0)$ and, for any $\alpha \in \mathbb{R}$, let

$$\Delta^{-\alpha} = \sum_{j=0}^{\infty} a_j(\alpha) B^j, \qquad a_j(\alpha) = \frac{\Gamma(j+\alpha)}{\Gamma(\alpha)\Gamma(j+1)},$$

be formally defined as in Hualde and Robinson (2010), where B is the backshift operator. With
these definitions we can extend the definition of the $I(d_1, \cdots, d_p)$ process $v_t$ in Section 2 to
non-negative real-valued $d_i$ such that $d_i \neq k - 1/2$ for any integer k. Note that for $d_i < 1/2$ the ith
element of $v_t$ is 'asymptotically stationary' (due again to the truncation in the definition of $v_t^{\#}$),
while $d_i > 1/2$ represents the 'nonstationary' region.
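
As a small numerical illustration of these definitions, the coefficients $a_j(\alpha)$ can be generated by the recursion $a_0(\alpha) = 1$, $a_j(\alpha) = a_{j-1}(\alpha)(j - 1 + \alpha)/j$, which follows from the Gamma-function formula above. The function names `frac_coeffs` and `frac_integrate` below are hypothetical and the code is only a sketch; the truncation at the start of the sample mirrors the indicator $1(t > 0)$.

```python
import numpy as np

def frac_coeffs(alpha, m):
    """Coefficients a_0(alpha), ..., a_m(alpha) of the expansion of Delta^{-alpha}."""
    a = np.empty(m + 1)
    a[0] = 1.0
    for j in range(1, m + 1):
        # ratio of consecutive terms of Gamma(j + alpha) / (Gamma(alpha) Gamma(j + 1))
        a[j] = a[j - 1] * (j - 1 + alpha) / j
    return a

def frac_integrate(v, alpha):
    """Apply the truncated filter: x_t = sum_{j=0}^{t-1} a_j(alpha) v_{t-j}, t = 1..n."""
    v = np.asarray(v, dtype=float)
    a = frac_coeffs(alpha, len(v) - 1)
    return np.convolve(a, v)[:len(v)]
```

With `alpha = 1` all coefficients equal one and the filter reduces to a cumulative sum, while `alpha = 0` leaves the input unchanged; values of `alpha` strictly between these give the slowly decaying weights typical of fractional integration.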

With this extended definition to cover fractional time series, we again consider a p x 1 observable
$I(d_1, \cdots, d_p)$ time series $y_t$ satisfying (2.1), partitioning $x_t$ as before. However, we also extend the
definition of cointegration, saying that $y_t$ is cointegrated if at least two $d_i$ are equal and exceed
1/2 and there exists a linear combination giving nonzero weight to two or more of these that is
I(c) for $0 \le c < 1/2$. Thus let $d_{\min} > 1/2$ be the smallest integration order of the elements of $x_{t1}$ and
let $\delta \in [0, 1/2)$ be the largest integration order of the elements of $x_{t2}$. Each component of $x_{t2}$ is then a
cointegrating error of $y_t$. Let $A = (A_1, A_2)$ and $\mathcal{M}(A_2)$ be defined as in Section 2. Then $\mathcal{M}(A_2)$
is called the fractional cointegration space and r is called the fractional cointegration rank. We
estimate $\mathcal{M}(A_2)$ and r in the same manner as in Section 2, though now a large $j_0$ should be used
in (2.3).

Let £i = (ej, ■ ■ ■ jsf)' be the p x 1 1(0) with mean zero such that = el. Let S„(t) = 

and /i = {i : di < 1/2, I <i <p}. 


Condition 3. 


(i) E||ei ||2 < °° some q > max(4, 2/(2dmin — 1)) and for any i,j £ R, as n ^ 00 , 


1 . • 
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(ii) There exists an i.i.d mean zero p x 1 normal vector {wj} such that as n —>■ oo, 

[nt] 

sup ||S'n(t) — ^^Wj ||2 = Op(n^/®), for some s > 2. 


0<t<l 


i=l 


Remark 5. Condition 3 is mild and is satisfied by either of the following processes.
1. Suppose £t follows a linear process: 


^ ^ t Ij 2, ■ 


fc =0 


and {e^} are i.i.d vectors with mean zero, Ee^eJ = Sg > 0, EHe^Hl < oo for some g > 4, the 

OO 

pxp coefficient matrices satisfy A:||Cfc|p < oo. Then, by Lemma 2 of Marinucci and 

k=0 

Robinson (2000), we have (ii) of Condition 3 holds, (i) follows by ergodicity. 

2. Suppose £t follows a generalized random coefficient autoregressive model: 

s:t = C-tSt-i + Gt ( 5 - 1 ) 

and {(Ct,et)} are i.i.d random variables TOt/iE||Ci ||2 < 1 and E||e||'^ < oo for some q>2, 
then (ii) of Condition 3 holds with s < mm{g,4}, see Corollary 3.4 of Liu and Lin (2009). 
Similarly, (i) follows by ergodicity. 

Theorem 5. Let r be known. Under Condition 3, $D(\mathcal{M}(\hat{A}_2), \mathcal{M}(A_2)) = o_p(1)$. Furthermore,

D(M(A2),M(A2)) = 

Let f* = max{j : Ap+i_j/(n'^™+'^“^Ap) <1, 1 < j < pj and r be defined as in (12.71) . 
Theorem 6 . Let Condition 3 hold. 

(i) lim„_).oo= r) = 1 provided 1 <r <p and 

(ii) lim^^oo P{r = r) = l provided lim„^oo(l/wn + = 0 . 

6 Conclusions 

We propose in this paper a simple, direct and model-free method for identifying cointegration
relationships among multiple time series whose component series may have different
integration orders. The method boils down to an eigenanalysis of a non-negative definite
matrix. One may view the components of the transformed series $\hat{x}_t = \hat{A}'y_t$ as arranged
in ascending order of the "degree" of stationarity, reflected by the magnitude of
the eigenvalues of $\hat{W}$. Then, in addition to the proposed information criterion for determining
the cointegration rank, unit-root tests may be applied to determine the number of stationary
components of $\hat{x}_t$.
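
To make the eigenanalysis concrete, the following minimal sketch assumes that $\hat{W}$ is the sum over lags $0, \ldots, j_0$ of $\hat{\Sigma}_j\hat{\Sigma}_j'$, with $\hat{\Sigma}_j$ the lag-$j$ sample autocovariance matrix of $y_t$, as suggested by the form of $W^x$ in the Appendix. The function names and the default `j0=5` are hypothetical, and the rank-selection rules (2.6) and (2.7) are not implemented.

```python
import numpy as np

def lagged_autocov(y, j):
    """Lag-j sample autocovariance (1/n) sum_t (y_{t+j} - ybar)(y_t - ybar)'."""
    n = y.shape[0]
    yc = y - y.mean(axis=0)
    return yc[j:].T @ yc[:n - j] / n

def eigen_transform(y, j0=5):
    """Eigenanalysis of W = sum_{j=0}^{j0} S_j S_j' for an n x p data matrix y."""
    y = np.asarray(y, dtype=float)
    p = y.shape[1]
    W = np.zeros((p, p))
    for j in range(j0 + 1):
        S = lagged_autocov(y, j)
        W += S @ S.T                      # each term is non-negative definite
    eigvals, eigvecs = np.linalg.eigh(W)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # reorder so eigenvalues are descending
    lam = eigvals[order]
    A_hat = eigvecs[:, order]
    x_hat = y @ A_hat                     # row t holds A_hat' y_t
    return lam, A_hat, x_hat
```

Under this ordering the last columns of `A_hat` correspond to the smallest eigenvalues, so the last components of `x_hat` are the candidates for stationary cointegrating errors; a unit-root test can then be applied to each of them in turn.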

In this paper we focus only on inference on the cointegration rank r and the cointegration space
$\mathcal{M}(A_2)$. One practically relevant open problem is to identify the subspaces of $\mathcal{M}(A_1)$ according
to the different integration orders of the components of $x_{t1}$. Further, it would be interesting to
consider letting $j_0$ in (2.3) diverge together with the sample size n.
7 Appendix: Technical proofs

7.1 Proof for Section 3.1


Let

$$\Sigma_j^x = \mathrm{diag}\Big( \frac{1}{n}\sum_{t=1}^{n-j} (x_{t+j,1} - \bar{x}_1)(x_{t1} - \bar{x}_1)', \ \frac{1}{n}\sum_{t=1}^{n-j} (x_{t+j,2} - \bar{x}_2)(x_{t2} - \bar{x}_2)' \Big) =: \mathrm{diag}(\Sigma_{j1}^x, \Sigma_{j2}^x),$$

$W^x = \sum_{j=0}^{j_0} \Sigma_j^x (\Sigma_j^x)' =: \mathrm{diag}(D_1^x, D_2^x)$, and let $\Gamma_x$ be the p x p orthogonal matrix such that

$$W^x \Gamma_x = \Gamma_x \Lambda_x,$$

where $\Lambda_x$ is the diagonal matrix of eigenvalues of $W^x$. Since $x_{t1}$ is nonstationary and $x_{t2}$ is
stationary, intuitively $\frac{1}{n}\sum_{t=1}^{n-j}(x_{t+j,1} - \bar{x}_1)(x_{t1} - \bar{x}_1)'$ and $\frac{1}{n}\sum_{t=1}^{n-j}(x_{t+j,2} - \bar{x}_2)(x_{t2} - \bar{x}_2)'$ do not
share the same eigenvalues, so $\Gamma_x$ must be block-diagonal. Define $W^y = A W^x A'$; then

$$W^y A\Gamma_x = A W^x A' A \Gamma_x = A \Gamma_x \Lambda_x.$$

This implies that the columns of $A\Gamma_x$ are the orthonormal eigenvectors of $W^y$. Since $\Gamma_x$ is
block-diagonal, it follows that $\mathcal{M}(A_2)$ is the same as the space spanned by the eigenvectors corresponding
to the smallest r eigenvalues of $W^y$. As a result, to show that the distance between the cointegration
space and its estimate is small, we only need to show that the space spanned by the eigenvectors
of $W^y$ can be approximated by that of $\hat{W}$. This question is usually handled by matrix perturbation
theory. In particular, let

$$\hat{W} = W^y + \Delta W^y, \qquad \Delta W^y = \hat{W} - W^y,$$

and

$$\mathrm{sep}(D_1^x, D_2^x) = \min_{\lambda \in \lambda(D_1^x),\ \mu \in \lambda(D_2^x)} |\lambda - \mu|,$$

where $\lambda(A)$ denotes the set of eigenvalues of a matrix A. When $\|\Delta W^y\| = o_p(\mathrm{sep}(D_1^x, D_2^x))$, one
can use the perturbation results of Golub and Van Loan (1996) to establish the bounds of Theorems
1, 3 and 5; see also Lam and Yao (2012) or Chang, Guo and Yao (2014). However, in our setting
$\mathrm{sep}(D_1^x, D_2^x)$ can be of smaller order than $\|\Delta W^y\|$, i.e., $\mathrm{sep}(D_1^x, D_2^x)/\|\Delta W^y\| \to 0$ in probability as $n \to \infty$,
and the above method will not work.
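
The separation quantity above is simple to evaluate for given symmetric blocks. The sketch below is purely illustrative (the function name `sep` mirrors the notation, and symmetric inputs are assumed so that `numpy.linalg.eigvalsh` applies); comparing its output with a norm of the perturbation shows when the classical bound is informative.

```python
import numpy as np

def sep(D1, D2):
    """Smallest gap |lambda - mu| between eigenvalues of two symmetric matrices."""
    lam = np.linalg.eigvalsh(np.asarray(D1, dtype=float))
    mu = np.linalg.eigvalsh(np.asarray(D2, dtype=float))
    # all pairwise absolute differences, then the minimum over pairs
    return float(np.min(np.abs(lam[:, None] - mu[None, :])))
```

A classical eigenspace bound is useful only when the perturbation norm is small relative to this gap, which is the requirement that fails in the present setting.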

To fix this problem, we adopt the perturbation results of Dopico, Moro and Molera (2000)
instead. A similar idea was used by Chen and Hurvich (2006) to recover their fractional cointegration
spaces via the periodogram matrix, using a random diagonal block matrix instead. However,
because of the quadratic form of (= cannot find a normalizing constant 
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matrix Cn such that CnW^C„ = Oe(l) or CnW^Cn = O^iX), as a result, the argument of Chen 
and Hurvich (2006) based on the perturbation bound of Barlow and Slapnicar (2002) cannot be 
used. To this end, we first establish some lemmas (i.e. Lemmas 7-10 below) and relegate their
proofs to the supplementary material.

For 1 < r < p - r, set p^t) = W^{t), /^,(t) = dt, ^ = Ezl and define 

F\t) = flit)- flit)dt, GXt) = ^-^ -i, G, = -^Grf(t). 

Jo d\ n 

Then, we have the following weak convergence result for the sample autocovariance. 

Lemma 7. Let L^t) = G^it) — G^- Suppose x\ ~ /(di), 1 < * < p — r, then under Condition 1, 

’ 1 < ^ < P - T-) l<i<p-r^ and (7.1) 


1 *6 

( di+ 1/2 ~ ~ ldiLdiit)){xi -Exi),i<p-r,p-r + l < j < p) ^ 0. (7.2) 

* t=i 

Next, we establish a bound for the eigenvalues of SJ and A'SjA =: S^. 

Without loss of generality, we assume the first si components of xji are /(ai), the next S 2 
components are 1 ( 02 ) and the last si components of x^i are /(a^), that is, 


I{ai) 


IG2) 


d{cLl) 


xti = [xj,- 


) Xt ) 


X 


Sl + 1 


) Xt 








E 




i=i y 


where oi > 02 > • • • > a; = dmin are positive integers and Si = p — r. For 1 < f < f, define 
= I]}=i Sj- Then for any xt(si) := (x^*+\ • • • if /Xj := pui+i, ■ ■ ■ , hui+si)' + 0, there 

must exist a Sj x (s, — 1) matrix Pj and si x 1 vector fit such that P(Pi = I(s._i), (Py/x*) has 
full rank sy P(/Xj = 0 and A(/Xj = 1, where denotes ax a matrix. Let Bj = {Pi,n~^Gfx.y if 
/Xj 7 ^ 0 and Bj = I^. if /Xj = 0, and ©„ = diag(Bi, B 2 , • • • , B;, I,.). Define 

Si Si 

Dni = diag(^ D „2 = 

and D„ =: diag(D„i,D„ 2 )- Let H'^it) = P/d\ — l/(d-|-l)!, T*(t) be given as in Lemma 7, 
F,(t) = (F"^+i(t),--- ,T"^+^*(t))', Mj(t) = (F'(t)P„77“^(t))'/(/x, / 0) + F,(t)/(/x, = 0), and 
M(t) = (M(^(t), M 2 (t), • • • ,M((t))h By Lemma 7 and continuous mapping theorem, we have 

Lemma 8 . Let rj(a:) = diag(^ “ xi)(xti — xi)', Cov(xi+j^ 2 , xi^ 2 )y Under Condition 

1, we have 

diag(^'M(t)M'(t)dL Cov(xi+,- 2 , xys)). 
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Let F^{t), 1 < i < p—r be defined in Lemma 7, where W^{t) = aiiB^{t) and I < i < p—r 

are independent Brownian motions. Let F(t) = {F^(t), F'^(t), ■ ■ ■ , FP~^(t))'. We have 
Lemma 9. Under condition 2 and p = with 0 < r < 1/2, 

/ ri 


-diag(^ J F{t)F'{t)dt, Cov(xi+j- 2 , xi, 2 )j ^ = Op(l). 
Further, F{t)F'(t) dt is positive definite. 

Lemma 10. Under Condition 1, or Condition 2 and p = we have 

max ||D-i0„(Sj-rf)0;D 
0<j<jo 


(7.3) 


0 <j<jo 


" X 

U 


'J) 0 ;D- 1||2 ^0 and 

(7.4) 

rj)0;D-i||2^o. 

(7.5) 


Proof of Theorem 1. Since 

{I1(A1(A2),A1(A2))}^ = -{tr[A2(/p — A2A2)A2]} < ||A2(A2A2 — A2A2)A2||2 < 2||A2 — A2||i, 

it follows from Theorem 1.5.5 of Stewart and Sun (1990) (see also Proposition 2.1 of Vu and Lei 
(2013)) that 

D(M(A2),M(A2)) < v^||A2 - A 2 II 2 < \/2||A2 - A 2 IIF < 2v^||sin0(A2,A2)||F, (7.6) 

where 0 (A 2 , A 2 ) = arccos[(A 2 A 2 A 2 A 2 )^/^] is the canonical angle between the column spaces of 
A 2 and A 2 . Let p = min^g^^j^^^ pGA(D'') 1"^ “ where A(D|) consists of the r smallest 

eigenvalues of A'WA =: W^. By Theorem 2.4 of Dopico, Moro and Molera (2000), we have 


sin0(A2,A2)||F < \\{Wy)-^/^AWy{W)-^/^\\F/p. 


(7.7) 


Note that 

Thus, by equations ()7.6p . ()7.7I) and ()7.8p . we have 

D{M{A2),M{A2)) < (||(W3')-1/2(W)1/2||^^||(^P)1/2(^)-1/2||^)/^^ 

Next, we show that||(W^)“^/^(W)^/^||F = Op(l), which is equivalent to 

||(\W)-i/2(w")i/2||^ = Op(l). (7.9) 

Note that 

jo jo 

0 < Sg < (W^)^/2 < ^{5 ]J(sJ)'} 1/2 and 0 < So < < ^{sJ(sJ)'}^/2^ (7.10) 

j =0 j =0 
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It follows from (|7.10p that 


Jo 


||(W-)-V2(W-)V2||^<^||(5]g)-l{gJ(gJy}l/2||^. 

3=0 

Thus, for (I7.9p . it is enough to show the eigenvalues of are Op(l), 

which can be transformed to show that 

the solutions A of — AX)g| = 0 are Op{l). (7-11) 

Since diag ^fg M(t)M'(t) dt, Var(xi^ 2 )^ > 0, by Lemma 10 the solutions (A) of equation 


D-i©4e"(s")'}1/2©;d-i - AD”1©„5]S©:,D-1| = 0 


(7.12) 


are bounded in probability. Thus, we have ()7.11l) and (17.Op as desired. 
Similarly, we can show 

||(WJ^)l/2(W)-V2||^ = ||(W-)l/2(W-)-V2||p = Op(l). 


(7.13) 


Using equations ^7.101) and (I7.13p . the remaining proof for Theorem 1 is to show that there exist 
two positive constants ci,C 2 such that in probability rj > provided |/o| > 2 or 

|Io| = 1 and Ezj° = 0 and r] > jy/jo provided |/o| = 1 and Ez/° / 0 . 

Define Ai(A) be the i-th eigenvalue of a matrix A. Note that diag ^ M(t)M'(t) dt, Var(xi^ 2 )^ > 

0. By Lemmas 8 and 10, it follows that when |/o I > 2 or |/o| = 1 and Ez/° = 0 , Ap_r(ElJ) = 
( 9 ^(j.j 2 (i,Tiin-i) Ap_r+i(Elj) = Oe(l)- Thus, there exists two positive constants 03,04 such that 
in probability 


Ap-.(W") > Ap_,(5]S(Sg)0 > C3n2(2'^—-1) 


(7.14) 


and 


30 


1 2 


03 < Ap_,+i(So(So)') < Ap_,+i(W") < [Ap_,+i( j;{S^.(%.)' 

1=0 

Hence, in probability 

d > - O4jol/Y^C3n2(2'^min-i)c4j2 > 

Similarly, we have |Io| = 1 and Ez/° 7 ^ 0, then in probability, 

T] > 


< C4jo- 


(7.15) 


(7.16) 


Since jo is fixed, combining p7.9p . p7.16p and p7.16p . we complete the proof of Theorem 1. □ 
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Proof of Theorem 2. First, we prove the consistency of r. Similar to (I7.15p . there exist two 
positive constants C 5 , cq such that 

C5 < Ap(So(£o)') < Ap(W") < cdl (7.17) 

Since Aj = Aj(W) = Ai(AW^A') = Aj(W^), equations (|7.14l) . (I7.15P and (I7.17P imply when 
|7o| > 2 or |/o| = 1 and = 0 , 

Xi ^ ^ Cdo f . 

^- tor ^ < p — r and < —- ^or^=p — r + 1 ,-- - ,p 

nXp cej'o nXp ^^5 

hold uniformly in probability. As a result, we have hm„_,.(X) P{r = r} = 1 provided |/o| > 2 or 
I Jo I = 1 and = 0 . The consistency of r under the setting that |/o| = 1 and 7 ^ 0 , can be 
proved similarly. 

As for the consistency of r, it follows from its definition that 

r r 

\+i-j + {p- r)uJn < ^ ^p+i-j + {p- r)uJn- (7.18) 

1=1 1=1 

Suppose that r < r, it follows from (17.181) that 

r 

{r-r)UJn< Ap+i-j < {r -P)Xp+i.r. (7.19) 

l=f+l 

However equation (I7.15P implies that in probability, 

Ap+i-r = Ap+i_,(AW"A') = Ap+i_,(W") < cdl 

Since oJn/J q —^ 00 , it follows that equation (I7.19P holds with probability zero. This gives that 

lim P{r < r} = 0. (7.20) 

n—>-oo 

On the other hand, if r > r, equation (j7.18p yields 

r 

(t dXp—r ^ ^ ^ "^P+l”! — (i* i’)ain- (7.21) 

l=r+l 

By (I7.14p . we have when |/o| > 2 or |/o| = 1 and Ez^ = 0, 

Vr = Ap+i_,(W") > (7.22) 

A similar argument to (|7.14p deduces when |/o| = 1 and Ez^ / 0, 

Vr- = Ap+i_,(W") > (7.23) 

Since 0 as n —)■ 00 , equations (I7.2ip - (l7.23p imply 

lim P{r > r} = 0. (7.24) 

n—>-oo 

Equations (7.20) and (7.24) together give conclusion (ii) of Theorem 2, as desired. □
7.2 Proofs for Section 3.2


Proof of Theorems 3 and 4. Theorem 3 can be shown similarly to Theorem 1 by using 
Lemma 9 instead of Lemma 8, except that when p ^ oo, 




P \ 1 / 2 ' 

|=0,(pi/=), 

i=l 


where Xi,l < i < p are solutions of ()7.1ip . As a result, ()7.9p should be replaced by 

And Theorem 4 can be shown similarly to Theorem 2. We omit the details. 

7.3 Proofs for Section 5


(7.25) 

□ 


To prove Theorems 5 and 6, we first introduce some notation. Let kni = ^^‘^I{di > 1/2) + 

n'ii+y^I{di < 1/2) and A(t - s) = {t - s)‘^*-yT{di)I{di > 1/2) + {t - s)‘^*/T{di + l)I{di < 1/2). 
Define K„ = diag(fc„i, • • • , knp), A(L s) = diag(Ai(t — s), ■ ■ ■ , Ap(t — s)) and 


Bo — 0, Bi — 


(B/,... ,5/)'= fA{t,s) 
Jo 


dWs, Vt = Bt- / Btdt, 


where is given in (ii) of Condition 3. 

Lemma 11. Let If = {i : di 
(ii) of Condition 3, we have 


Lemma 11. Let If = {i : di > 1/2}, = (xj, i G /)' and Z„(t) = 


. Under 


K;iZ„(t) Bi, onD^lf. 


(7.26) 

Proof. Let d/j = {di : i G Iij, then ^jh is a integrated fractional process with order dj.^ + 1, 
each of its components has order larger than 1/2. Using (ii) of Condition 3 instead of Marinucci 
and Robinson (2000) Lemma 2, we can show this lemma similarly to their Theorem 1. □ 

Lemma 12. Under Condition 3, for any 0 < j < jo, we have 
(i) // Ji = 0, then 

BfU'dt and 




n j n 


diag( 


(ii) If h ^ 0, then 


1 1 ~ 


^ diag^y \Jt,ifV[icdt, CoY{^t+j,h^t,h)j and (7.27) 

L-iy}L-i A diag(y|'ut,rfU(,,cdL Cov(xi+,-,,xt,,j), (7.28) 

where L„ = diag(/„i, • • • ,lnp), Ini = n‘^'~^^‘^I{di > 1/2) + I{di < 1/2). 

By Lemma 12, Theorems 5 and 6 can be established in a similar manner as to Theorems 1 
and 2. Therefore we omit the detailed proofs. 
Figure 1: Sample ACF/CCF of $y_t$ with n = 1000, p = 6 and r = 2 in Example 1.

Figure 2: Sample ACF/CCF of $\hat{x}_t$ with n = 1000, p = 6 and r = 2 in Example 1.

Figure 3: Boxplots of $D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))$ for p = 6, r = 2 and 500 ≤ n ≤ 2500 in Example 1.

Figure 4: Boxplots of $D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))$ for p = 10, r = 4 and 1000 ≤ n ≤ 3000 in Example 1.

Figure 5: Boxplots of $D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))$ for p = 20, r = 10 and 1000 ≤ n ≤ 3000 in Example 1.


[Figure panels: sample ACF (diagonal panels, Series 1 to 6) and sample CCF (off-diagonal panels, Srs i & Srs j) of the six component series, each plotted against Lag.]

Figure 6: Sample ACF/CCF of $y_t$ with n = 1000, p = 6, r = 2 and s = 2 in Example 2.

Figure 7: Sample ACF/CCF of $\hat{x}_t$ with n = 1000, p = 6, r = 2 and s = 2 in Example 2.
Figure 8: Boxplots of $D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))$ with the estimated r (left panel, "Ratio method") and the true r (right
panel, "True r") for Example 2, with p = 6, r = 2, s = 2 and 500 ≤ n ≤ 2500.

Figure 9: Boxplots of $D_1(\mathcal{M}(\hat{A}_2), \mathcal{M}(B_2))$ with the estimated r (left panel) and the true r (right
panel) for Example 2, with p = 10, r = 4, s = 1 and 1500 ≤ n ≤ 5000.



Figure 10: Time series plots of the 8 monthly U.S. Industrial Production indices in January 1947
- December 1993.

Figure 11: Time series plots of the estimated $\hat{x}_t$ by the proposed method and their sample ACF
for the 8 monthly U.S. Industrial Production indices.

Figure 12: Time series plots of the estimated $\hat{x}_t$ by Johansen's method and their sample ACF for
the 8 monthly U.S. Industrial Production indices.
S.1 Proofs of the lemmas

Proof of Lemma 7. For any I{di) process x{, we can write 

V^^xi = Ezi + {zi-Ezi) =:w + d. 

Let C(0) = Ci, C(0) = W and 

uKj) = jz uiU - 1), vlij) = X C(j - !)• 

Then 

t t 

x[ = Ulidi) + Vlidi) = Y.ul{di - 1) + Y.vZdi - 1). (S.l) 

i=i i=i 

By induction, we have 

di—1 

Vi{dl)=^XlW{t + j)/dl\ = ^ilGd,{t). (S.2) 

j=0 

On the other hand, since = 0, by (i) of Condition 1 and continuous mapping theorem, it 
follows that 


fd^s), on D[0,1]. 


(S.3) 


1 


Thus, by (IS.l[) - (IS.3p . 


ix\ns]-f^iGdi{[ns]))/n^^ onT)[0,l]. (S.4) 

Since Sl^{ti), 1 <i <p converge to their limiting distribution jointly, (7.1) follows from (IS.41) and 
the continuous mapping theorem. 

As for conclusion (7.2), by the joint convergence condition (see (i) of Condition 1) and (7.1), 

{x{ — Ex;^), I < i < p — r, p — r + 1 < j < ^ 

on D[0, 1] X D[0, 1]. (ii) of Condition 1 implies that E|xip < oo. This gives 

1 ” . 

max \xi — Exi\/^/n = Op(l), and — \xi — Ex^| = Op{l). (S.5) 

1<S<T1 Tl 

S = 1 

Thus, by Theorem 3.1 of Ling and Li (1998), we have 

1 " . . 

ndi+i /2 “ l^iLdi{t)){xl - Ex{) 0. 

Since p is fixed, we have (7.2) as desired. □ 


* c—1 


Proof of Lemma 8. For any 1 < i < ^, we define x(sj) = ^ When /Xj = 0, Lemma 7 

gives 


Bi(xf(si) -x(si)) _ (xf(si) -x(si)) X 


n 


a.i-l/2 


n 


ai-l/2 


Fi{t), on ]Jl»[0, 1]. 
i=i 


(S.6) 


When /Xj / 0, by /x'/x^ = 1, we have Bi^tisi) = ((P'xt(si))',n + n ■ 

Note that 

-^(Ga,{[nt]) + /x*X[„t](si)) = lim ^G„,([nt]) = 1™ ^ H ([^^] + j) = (S.7) 

71—^00 7x“® n—7-00 n—>-oo n“* -*■ n .^! 


n—7-00 n 


n—^QO n 


j=0 


ad 


Thus, by P(/Xj = 0, Lemma 7 and continuous mapping theorem, we have 

Bi(xt(sj) - x(sj)) ji 


((P'F,(t))',i7“Ht))' on 

i=i 


Combining (jS.bp . (jS.8jl and the joint convergence condition ((i) of Condition 1) yields 


(S.8) 


Bi(xf(si) - x(sj))y 


n' 


ai-l/2 


p—r 


,1 < i </) (M'(t),l < z < 0 = M'(t) on JJd[0,1]. (S.9) 

i=i 


Let B = diag(Bi,B 2 , • ■ • ,Bi) = (bij). By (IS.9P and continuous mapping theorem, we have 
® {~ “ xi)(xti - XI^(D-iB(xii - xi))(D“iB(xii - xi))' 


t=i 


t=i 


M(t)M'(t) dt 


and complete the proof of Lemma 8. 


(S.IO) 

□ 


Proof of Lemma 9. First, we show (7.3). To this end, it is enough to show 

-Xl)'j ^ F{t)F'{t)dt ^ = Op(l). 


t=i 


(S.ll) 


Let ) • • • !?f be an integrated process with components satisfying 

= auvi := n- 

For any given 1 < i, j < p — r, 

1 ” 

Ei("i - w - - (fi - - ?)i 

" 1.1 

ST3- Ei“^< - «- + 3TS7 Eie - «')N - e - - J'’)] 




t=l 




t=l 


= : r^- + r?-. 

' ij ^ ' ty 


By induction, it is easy to show that under condition (3.2), 

1 * 

sup sup E[(x)-^*)/n‘^*-L2]2 < sup snp -F{Sie\f = (S.12) 

l<i<p — r 1<£<72 l<2<p—7* l<t<72 

sup sup = 0(1) and sup sup E(x)/n'^'“^/^)^ = 0(1). (S.13) 

l<2<p—7" l<t<72 l<2<72l<t<72 

Thus, by equations ()S.12p . (IS.13P and the independence of the components, 

rj=i 


which implies 


D 


D 


= Op{prP 


i ^(xti - xi)(xii - xi)' j - (i - 0(4, - ly 

t=l / \ t=l 

where 4 = (4^) 4^)''' i ff’~''y■ Thus, for the proof of (IS.llh . it suffices to show 

1 p 

Effi - - P) - / dt = o^yn^-r't). (s,i4) 


sup 

l<2,j<p—r 




t=l 


Note that 


1 " 

-E 

n. 


V2y 1/2 J 


^ rt/n 

X] / f{a)f{a)da 

^_i J(t—l)/n 


1 ^ 

-E 


n ^^ \ 

('flu 


- f{t/n) 


et 


dj-l/2 


n 3 


+ -E/'(‘/") 


g 


t=i 


dj-l/2 


n J 




/•t 

-J2 {fia)-f{t/n))f{a) + f{t/n){f{a)-f{t/n))da 

t=l d{t-l)/n 

— • dnli'i'ij) “1“ Jn2{iij) + Jnsi'^i j) • 

From the dehnition of I{d) process, it is easy to deduce that if ~ I{d) satisfying = et, 
then = X]s=i when d = 1 and when d > 2, 


«. = E 


s=l Li=l 


d-1 


]J(t-s + i)/(d- 1)! 


and P{t) can be rewritten as 

fit) = fit 
Jo 

By (|S.16p and the continuity of VF®(s), it is easy to get that 


- s 


\dj — l jjjri 


dW\s)/{di - 1)!. 


(S.15) 


(S.16) 


Thus, 


sup sup \[pia) - Pit/n)]fPa)\ = Oa.s.in n). 

(t—l)/n<a<t/n 


sup \Jn 3 ii,j)\ = Oa.s.in ^/^log^n). 

1<2 j<p—r 


(S.17) 


Set nh=i(^ — s + d)/0! = 1. Using expressions (IS.ISP and (IS.161) . we have 


* _ f* 


S/n' 


fit/n) = ^ 






S=1 


UE 


^di 1/2 — 1)! 

{t — (t/n — 


)! Jo 


It is easy to get that 


^ n<^i ^/"^{di - 1)! do {di - 1)! 
=: HPit) + KPt). 


sup sup \Hl^iit)\ = Oa.s.in ^/^logn). 

l<i<p—r l<t<n 


dW\s) 


(S.18) 


On the other hand, we have for any 1 < t < n, 

t 


K2it) = 


1 


E 


t S 


idi-^y-^^\\n 


) di — l-^ /•s/n / j. \di — l \ 

(--a) dIU*(a) 
y/n Jls-l)/n \n J / 


^ rs/n 


'U 


{di - 1)! ^ d(s-i)/ 


t s 

n n 


di-l 


-E-a 

n 


di-l 


dW\a). (S.19) 


This gives 


sup sup \Hn 2 {t)\ = nOa.s{n ^''^logn) = Oa.s.{n ^'^^logn). 

l<i<p—r l<t<n 

Since the normal sequences {vl} are independent with respect to i, it follows that 

sup sup \^{\ = Oa.s.{n'^^~^^‘^logn). 

l<t<n 

Thus, by (IS.lSp . (IS.201) and (IS.211) . we have 

sup |Jni(^,j)| = Oa.s.(n"^/^log^n). 

l<z,ji<p—r 


Similarly, we have 


sup \Jn2{i,j)\ = Oa.s.in ^/^log^n). 


(S.20) 


(S.21) 


(S.22) 


(S.23) 


Using the same argument, we can show 


sup 


f' f^t)dt C nt)dt 

Jo Jo 


= Oa.s.in ^/^log^n). (S.24) 


Combining equations (S.17) and (S.22)-(S.24), we have (S.14) and conclude (7.3). The positive
definiteness of $\int_0^1 F(t)F'(t)\,dt$ can be shown similarly to that of Lemma 3.1.1 in Chan and Wei
(1988). □

Proof of Lemma 10. We first consider the case with fixed p. To this end, we split the matrix
into three parts: the nonstationary block, the cross block, whose elements are products of a
stationary component with a nonstationary component, and the stationary block.

(I) As for the nonstationary block, we have for 1 ≤ i, h ≤ p - r,

n—j n 

Y,i4+, - r){x1 - s'-) !i‘)(4 - 1*) 

t=l t=l 

j n-j 

= - Y.1.4 - x‘)(x^ - X>‘) - - xf) 

t=l t=l 

j n-j 

= -Y^iA-x" - PiLd,it))ix^ -x'^- PhLdhit)) - Y^^t+3 - OiLdiit))ixt+j - Xt) 

t=l t=l 

j j 

-phY^dHit)ixl - - OiLdiit)) - OiY^diit)ixt - x^ - OhLdf^it)) 

t=i t=i 

j ri-j 

~OiOh 'y ^ Ldfi it)Lidi {t) ~ Oi 'y ^ ^di (t) (Xf+j ~ Xt) 

t=l 

6 

^ ^ dnmijOT h). 


t=l 


m=l 


From (S.4), it follows that


sup 

o<i<io 


\SniU,i,h)\ 


< sup ~ 

^ \l<t<n IT' 

= Op{l/n). 


di-l/2 


/ 1x1 - 

VlSIn 


As for $\delta_{n2}(j,i,h)$, we have

(1) If d/i = 1, then x^_^_j — x^ = e^- Since E|e^| < oo, it follows that 


\^n2{j,i,h)\ 

sup -J —1 - 

o<i<io 


< 


( sup 

\t<n 


\xl - x^ - HiLdiit)\ 


n 


di-l/2 


Op(l/n^/2). 



n t+jo 

EE 


t=l i=t+l 



(S.26) 


(ii) If $d_h \ge 2$, then $x^h_{t+j} - x^h_t = \sum_{s=t+1}^{t+j}[U^h_s(d_h-1) + V^h_s(d_h-1)]$ (see Lemma 7), and it follows
that


sup sup 

l<t<n l<j<jo 


rrt h _ /yt flj 

jldh — ^12. 


< 


JO \U^{dh-l) + Vs\dh-l)\ 

nV2 


Op(l/n^/^), 


which implies 


sup 

j<jo 


\^n2{j,i,h)\ 


< 




I sup 

\t<n 


\xl - x^ - HiLdiit)\ 


n 


di-l/2 


( sup sup 

t<n j<jo 


qfth _ fvth 

fldh —1/2 


(S.27) 


(S.28) 


Let $\Delta_{nm}(j) = (\delta_{nm}(j,i,h))_{(p-r)\times(p-r)}$, $m = 1, 2, \ldots, 6$. Then by equations (S.25)–(S.28),

sup ||D~j^Bn"^(A„i(j) + A„2(j))B'D"j^||2 
i<j<io 

< (IZZ] Z] ) Op{l/n^^^) 

Vi=l 1=1 m=l / 

< + \ Op(l/n^/2) ^ Q^f^^2i^i/2^_ (g 29) 

\t=l 1=1 m=l / 

By the definition of $\mathbf{B}$, it is easy to see that the elements of $\mathbf{B}\Delta_{n3}(j)$ and $\mathbf{B}\Delta_{n4}(j)$ are zero except
in rows $\sum_{i=1}^{J} s_i$, $J = 1, 2, \ldots, l$, and the non-zero elements have the following forms:


n 


-^/^'^La^{t){xl- x^ - fliLd^it)). 


t=i 


Thus, by 


sup 

i<j<jo 


J2Li Ldh{t){x\ -x^ - /iiLdijt)) 

j^dy^-\-di — l/2 


Op(l)) 


we have 


sup IlDniBn (A„3(j) + A„4(j))B Drills = Op{pn ). 
i<i<io 


(S.30) 


Similarly, by snpi<j<j^ 


ELi Ldf^{t)Ld^ jt) 


= Op(l), we have 


sup ||D„i^Bn ^A„5(j)B'D„j^1 12 = Op(pn ^). 
l<j<jo 

Further, using (S.7), similar to (S.26) and (S.27), we can show

I n-j 


(S.31) 


t=i 


Thus, for $\Delta_{n6}(j)$ we have


sup IlD^^Bn ^A„60 ')B'D„j^|| 2 = Op(p/n^/^). 


(S.32) 


Combining equations (S.29), (S.30), (S.31) and (S.32) gives


D~/B 


n 


n-j 


y~l(xt+j,i - Xi)(xa - Xi)' - ^(xti - Xi)(xti - Xi)' 


t=i 


t=i 


/t\ —1 
nl 


B'D 


(S.33) 


n 


- X2)(xti - Xi)'B'Dj 


= OpbVn'/'). 

(II) As for the cross block, we first show 

n 

D"iBn-i - ki)(xt2 - ^ 2 )' 

t=i 

= Op(I). 

Note that for $1 \le i \le p-r$ and $p-r < h \le p$,

n n n 


t=i 


(S.34) 


t=i 


t=i 


t=i 




(S.35) 


Let (^p_r)xr und ^2 = {ioffJ(^p-r)xr- Then the elements of BLii = (ejh) have the 

following expression: 

p—r n p—r 

^jh = bji Yi^t -x"- ^^iLdi{t)){4 -X^)=Y 


i=l t=l 


i=l 


By Lemma 7, we have 


e-jh 


y^di+ 1/2 


< 


1 


p—r 


f^di-\-l/2 




(S.36) 


2=1 


On the other hand, by the definition of B, the elements of ^0,2 = {djh) can be represented as 


1 , ^ 


It is easy to get that 

= Op(l). 

Consequently, by (S.36) and (S.37), it follows that


-Xi)(xt2 -X2)' 


t=l 




= Op{l). 


Similarly, 


n 


y](xt2 - X 2 )(xti - Xi)'B'D„j^ 


t=l 


= Op(l)- 


(S.34) follows from (S.38) and (S.39).
Next, we show


sup 

3 <30 


DnlB 


n 


“ ^l)(Xh 2 - X2)' - - Xi)(xt2 - X2)' 


\t=l 


i=l 


= Op (1) 


and 


sup 
3 <30 


~ ( - X2)(xi+j '1 - Xi)' - 5^(xt2 - X2)(xti - Xi)' j 


U=1 


t=l 


— Op (1) . 


As for (S.40), note that for any $1 \le i \le p-r$, $p-r+1 \le h \le p$,


1 


3n-j 


„d,+l/2 I I2(^i+3 - 

\t=l t=l 


X? - X^) 


n-3 / _ ^i\ 

^+3 t \ , h 


1 •' 

-E 

71 < ^ 


n \ n'^»“^/2 


(4 - Ex?) - 


(x^ — Ex?) 


n 


h, n-3 

di-1/2 


E 

t=i 


n' 


1 


jidi+1/2 


^ (xj-x*)(x?-x'^) 


t=n—j-\-l 

$= L_{1n}(j,i,h) + L_{2n}(j,i,h) + L_{3n}(j,i,h).$


By (S.27) and $\frac{1}{n}\sum_{t=1}^{n} E|x_t^h| = O(1)$, it follows that when $d_i \ge 2$,

sup |Lin(i,i,/i)| = Op(l/n02). 
0<j<30 


(S.37)

(S.38)

(S.39)

(S.40)

(S.41)

(S.42)

(S.43)







































When dj = 1, by xh ■ — x\ = e^, we have 


i+io 


E sup |Li„(j,i,/i)| < mc^ —^ ^ E|e*(xJ‘- ExJ)| = 0(l/n^/^). 
o<j<jo ,r^i 

Thus, (S.43) also holds for $d_i = 1$. Similar to $L_{1n}(j,i,h)$, we have


sup 

i<i<jo 


-\ ^ J 
i ^t+j 

rr < ^ Y)di — lj2 
t=l 




Combining this with Condition 1 shows

sup \L 2 n{j,i,h)\ = - Exil] 

3 <30 ^ ' 


1 rpi _ 

i -^t+j ■^t 


-E 

71 




= Op{l/n). 


For $L_{3n}(j,i,h)$, by Lemma 7 and (S.7), we have


sup \xl — x*|/n'^* = Op{l), 

l<t<n 


thus, by $\sum_{t=n-j_0+1}^{n} E|x_t^h|/n^{1/2} = O(1/n^{1/2})$, we have


-jo+i 

Therefore, by (S.42)–(S.45),

n-j 


sup |L 3 „(j,z,/i)| = Op(l/n^/^). 


3 <30 


1 


Sjl 


- **) - E<“^i - *')(*!■ - *'■) 


t=i t=i 

which shows (S.40).

For (S.41), note that for any $1 \le i \le p-r$, $p-r+1 \le h \le p$,


/n-j n ^ 

^ - x>‘) - Y.(< - x3)(x’; - x") 

t=i t=i / 




(S.44) 


(S.45) 


= Op(l/ni/2), (S.46) 




^(xt+j - Xt){xl - X*) + LsnU, i, h) =: L^nU, i, h) + L^nU, i, h). 

t=i 


Let $\mathbf{L}(j) = (L_{4n}(j,i,h))_{(p-r)\times r}$ and decompose $L_{4n}(j,i,h)$ into two terms as in (S.35). Using the
same arguments as in (S.36) and (S.37), we can show


sup||n L(j)B = Op(l), 

3 <30 


(S.47) 


thus, by (S.45), we have (S.41). Combining equations (S.34) with (S.40) and (S.41) shows that
the cross blocks tend to 0 in probability.
(III) As for the stationary block, let TJ and Tj be the matrices obtained by replacing the
stationary block ^ - ^ 2 )' in and with Cov(xi+j^ 2 , x^ 2 )- (ii) of 

Condition 1, we have 

||SJ - TJII 2 = Op(l) and ||%" - t ;||2 = Op(l). (S.48) 

Thus, by (S.33) and the fact that the cross blocks tend to 0 in probability (see (II)), we have


D-i0„(5]J - rj)0;D-i||2 < ||D-i045])^ - T|)0;D,1||2 + ||D-i04T| - r?)0;D 


3 3 ' 


'3 


= Op{l) 


and 


|D-i0„(s" -r")0:,D-i||2 < ||D-i0„(s" - f")0 ;d“ 1||2 + ||D~i0„(T" -r")0:,D-i| 


= Op{l). 


Hence, Lemma 10 holds for finite p. 

Next, consider the case $p = o(n^{1/2-\tau})$. We still split the matrix into three parts as above.
(The nonstationary block.) Since hi < lim„_^oo Var(^”^j^ ^®/-y/ra) = cr?^ < 62 for all z, it 
follows that as n —>■ 00 , 


max 

l<t<n 


t 

Var ( zl / i/n) < 62 and Var < 62 - 


S = 1 


(S.49) 


Let $\delta_{n1}(j,i,h)$ and $\delta_{n2}(j,i,h)$ be defined as above with $\mu_i = \mu_h = 0$. Since the components of
$\{z_t\}$ are independent, by (S.49) and some elementary computation, we can show

2 " 


E 


p—r 

E 


I n'^i+dh 

,h=i ” 


\Sni {j,i,h)\ 


= 0 (joP /n ) 


and 


E 


p—r 


\Sn 2 {j,i,h)\ 

I n^^i+dh 

Lh=i ^ 


E bjp 


= 0 {j^pyn 


Combining the above two equations yields 

n-j 




^(xt+j -1 - Xi)(xti - Xi)' - ^(xt4 - xi)(xti - Xi)' 


t=l 


t=l 


D 


nl 


(S.50) 


= Op{pn 2 ). 
(The cross block.) Let $\omega_{ih}$ be defined as in (S.35) with $\mu_i = 0$. Since $z_t$ and $x_{t2}$ are
independent, it follows from (S.49) that


E 


p—r p 

[E E 

2=1 h=p—r-\-l 




p—r p 


E E 

2=1 h=p—r-\-l 

p—r p n 


E E "-^Ee 


{xl — X*) (xj, — X*) 


E[(4 - i'-)][(xf, - s'-)] 


i=\ }i=p—r-\-\ t^V=l 

( p-r p n 

E E E |EK*? - - s")]! I = 0(pVn‘-"T 

i=l h=p—r+l 


by (iii) of Condition 2, which implies 

n 

- Xl)(xt 2 - X 2 )' 


t=l 


= ||D„j^n = Op{pn 


(S.51) 


Similarly, 


n 


- X 2 )(xti - Xi)'Dj 


t=i 


= Opipn 


(S.52) 


Further, by some elementary computation, it is easy to show
^[Yl Y =0{pr/n^ 


-2t\ 


which gives 


sup 

j<jo 


D 


2=1 j=p—r-\-l 1=1 


-1 
Til 


n 


|^^(xt+j,l - Xi){xt+j^2 - X2)' - ^(Xfl - Xi)(xt2 - X2)' 


t=l 


(S.53) 


= Op(pn 2 +^). 


Note that L 4 „(j, i, h) = “ ELi(^i “ “ TJi=iiA+j “ Xt)xt+j, it is easy 


to show that E 


T.U ELp-.+i(^4n(j, i, h)f = 0{prln^-^-) too, thus 


sup 

i<io 


/n-j 


~ ( ~ ^2)(xt+j,l - Xl)' - ^(xi2 - X2)(xii - Xl)' j 


\t=l 


t=l 


(S.54) 


= Op{pn 2 +^). 


Consequently, by equations (S.51)–(S.54), we get that the norms of the cross blocks are $O_p(pn^{-1/2+\tau})$.

(The stationary block.) By (ii) of Condition 2, we also have (S.48). Thus, by (S.50) and
the bound of the cross blocks (see above), we have, if $p = o(n^{1/2-\tau})$,

\\U~\^^-T^)-D~% < ||D-i(Ej-T^")D-i||2 + ||D-i(TJ-rj)D-i||2 

= Op(l) + Op(pn"^/^) = Op(l) 
and


iiD-^(s; - T^)T>-% < iiD-i(s; - f;)D-i|i2 + \\n-\f; - T^)n-% = o,(i). 

Hence, Lemma 10 follows, which completes the proof. □


Proof of Lemma 12. By Lemma 11 and the continuous mapping theorem, it follows that (i) holds 
for $j = 0$. Thus, it suffices to show that for any $1 \le i, h \le p$,

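$$\sup_{1\le j\le j_0}\ \frac{1}{n^{d_i+d_h}}\left|\sum_{t=1}^{n-j}(x_{t+j}^i-\bar{x}^i)(x_t^h-\bar{x}^h) - \sum_{t=1}^{n}(x_t^i-\bar{x}^i)(x_t^h-\bar{x}^h)\right| = o_p(1). \tag{S.55}$$

Observe that

$$\sum_{t=1}^{n-j}(x_{t+j}^i-\bar{x}^i)(x_t^h-\bar{x}^h) - \sum_{t=1}^{n}(x_t^i-\bar{x}^i)(x_t^h-\bar{x}^h) = \sum_{t=1}^{n-j}(x_{t+j}^i-x_t^i)(x_t^h-\bar{x}^h) - \sum_{t=n-j+1}^{n}(x_t^i-\bar{x}^i)(x_t^h-\bar{x}^h) =: a_{n1}(j,i,h) + a_{n2}(j,i,h).$$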
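The split of the lag-$j$ cross sum into $a_{n1}(j,i,h)$ and $a_{n2}(j,i,h)$ is a purely algebraic identity, so it can be checked numerically on any pair of series. The following Python sketch is illustrative only: the function name `lag_sum_split` and the simulated I(1) and I(2) series are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def lag_sum_split(xi, xh, j):
    """Return (lhs, a1, a2) for the lag-j decomposition of the cross sum."""
    n = len(xi)
    ci, ch = xi - xi.mean(), xh - xh.mean()
    # Left-hand side: lag-j cross sum minus the lag-0 cross sum.
    lhs = np.sum(ci[j:] * ch[:n - j]) - np.sum(ci * ch)
    # a1: sum over t = 1..n-j of (x^i_{t+j} - x^i_t)(x^h_t - mean of x^h).
    a1 = np.sum((xi[j:] - xi[:n - j]) * ch[:n - j])
    # a2: minus the last j terms of the lag-0 cross sum.
    a2 = -np.sum(ci[n - j:] * ch[n - j:])
    return lhs, a1, a2

rng = np.random.default_rng(0)
n = 500
xi = np.cumsum(rng.standard_normal(n))              # simulated I(1) series
xh = np.cumsum(np.cumsum(rng.standard_normal(n)))   # simulated I(2) series

for j in (1, 3, 5):
    lhs, a1, a2 = lag_sum_split(xi, xh, j)
    # The identity holds exactly up to floating-point rounding.
    assert np.isclose(lhs, a1 + a2, rtol=1e-9, atol=1e-3)
```

The check says nothing about the orders of $a_{n1}$ and $a_{n2}$; those come from Lemma 11 and the bounds that follow.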


By Lemma 11, it follows that for any $1 \le i, h \le p$,




iU\t), U^is)) on D[0, if. (S.56) 


This gives 


sup =Op{l/n). 


0 <j<jo 


Further, for any $\epsilon > 0$,


lim P{ sup sup |xT — x\\/n'^' > e} = 0. 


Thus, 


sup la„i(j,i,/i)|/n‘^‘+'^'* = Op(l). 
o<j<jo 


(S.57) 


(S.58) 


(S.59) 


Combining (S.57) and (S.59) gives (S.55) as desired.
By (i) of Condition 3, it follows that


1 " 

-xif{xtp^ -x/i) Cov{xt+jj^xtjf 

^ i=i 

Thus, by (i) of Lemma 12, we have (7.28). 

As for (7.27), it is enough to show for any i G and h G Ii, 

^ ani{j,hh) an2{j,i,h) 1 


1 




\i+\/2 X ){X^ X ) p^di + l/2 + ndi+1/2 + „d.+l/2 ^ X ) 


2 = 1 


2 = 1 


0 , 


(S.60) 
holds for all $0 \le j \le j_0$. Similar to Lemma 7, we can show


^dj+l/2 




— x")(x^ — x^) ^4 0. 


2=1 


By (S.56) and $n^{-1}\sum_{t=n-j_0}^{n} E|x_t^h - \bar{x}^h| = O(j_0 n^{-1})$, we have


sup an 2 ij,i,h)ln'^^+^^‘^ = Op{l/n). 

3 <30 


By (S.58) and $\frac{1}{n}\sum_{t=1}^{n} E|x_t^h - \bar{x}^h| = O(1)$, we have

sup = Op(l). 

3 <30 

(S.60) follows by equations (S.61)–(S.63).


(S.61) 


(S.62) 


(S.63) 

□ 


S.2 Proofs of Remarks 3 and 4 


Proof of Remark 3. (i) By the martingale version of the Skorokhod representation theorem (Strassen
1967, Hall and Heyde 1980, and Wu 2007), for all $i$, on a richer probability space, there
exist a standard Brownian motion $\{W(t)\}$ and non-negative stopping times $\{\tau_j^i\}$ such that for
$t \ge 1$,

$$S_t^i = W\Big(\sum_{j=1}^{t}\tau_j^i\Big) \quad\text{and}\quad E[\tau_t^i \mid \mathcal{F}_{t-1}(i)] = E[(\varepsilon_t^i)^2 \mid \mathcal{F}_{t-1}(i)], \tag{S.64}$$

where $\mathcal{F}_t(i)$ is the $\sigma$-algebra generated by $\{\varepsilon_s^i, s \le t\}$. This implies that

t 

3=1 


< E 

t 

j;(r]-E(rj|7-i_i(i)) 

+ E 

t 

+ E 

t 


3=1 


3=1 


3=1 


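Since both $\{\tau_j^i - E(\tau_j^i \mid \mathcal{F}_{j-1}(i))\}$ and $\{(\varepsilon_j^i)^2 - E((\varepsilon_j^i)^2 \mid \mathcal{F}_{j-1}(i))\}$ are martingale difference sequences and
$E|\varepsilon_j^i|^q < \infty$, it follows that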


sup E 

l<t<n 


^(r}-E(T]|77_i(i)) 

3=1 


O 



E14 - E{T'|/',_i(i))] 


3=1 




Similarly, supi<<<„ E = ©(n^/-?*). Further, condition E ELJ44 

= implies that supi<^<„E |^^^^[(ej)^ — (7?J| = Thus, Condition 2(i) 


-crt 


holds for any r > 1/q*. If $p = o(n^{1/2})$, Condition 2(ii) holds. Since the components of $\varepsilon_t$ are
independent, Condition 2(iii) follows with the corresponding supremum of order $O(n)$.
(ii) By the proof of Theorem 9.3.1 of Lin and Lu (1996), there exists a martingale
difference sequence $\{m_t^i\}$ such that $R_t^i = S_t^i - M_t^i$ satisfies $E|R_t^i|^q = O(1)$, where $M_t^i = \sum_{j=1}^{t} m_j^i$.
Further,


E 


- E(m5)2] 


1 = 1 


9/2 


< On log n. 


(S.65) 


As a result, Condition 2(i) holds for any r > 1/q. Similarly, Condition 2(iii) can be easily obtained
from a basic inequality for mixing processes; see Lemma 1.2.2 of Lin and Lu (1996). Note that for
any given $j$,

^ n-j 

- y~](Xf+j^2 - X2)(xt2 - X2)' - Cov(xi+j,2,Xi,2) ^ 


E 


t=l 


- E f-^[(x)-x*)(x^ - x^) - Cov(x),x^)] j =O(p2/n)^0 

ij=p—r-\-l \ t=l / 


as $p = o(n^{1/2})$. Hence Condition 2(ii) holds too.

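(iii) By the Beveridge-Nelson decomposition, $\varepsilon_t^i$ can be represented as

$$\varepsilon_t^i = \Big(\sum_{l=0}^{\infty} c_{il}\Big)\eta_t^i - (\tilde{\varepsilon}_t^i - \tilde{\varepsilon}_{t-1}^i), \qquad \tilde{\varepsilon}_t^i = \sum_{l=0}^{\infty}\tilde{c}_{il}\,\eta_{t-l}^i, \qquad \tilde{c}_{ij} = \sum_{h=j+1}^{\infty} c_{ih}.$$

Let $R_t^i = S_t^i - \big(\sum_{l=0}^{\infty} c_{il}\big)\sum_{j=1}^{t}\eta_j^i = \tilde{\varepsilon}_0^i - \tilde{\varepsilon}_t^i$. Then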


sup 

l<t<n 


Sl- 




= 0 ( 1 ) 


(S.66) 


1=0 j=l 
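The Beveridge-Nelson step can be illustrated numerically: for a linear process, the partial sum minus the long-run coefficient times the partial sum of innovations telescopes to a difference of two stationary terms, which is why its moments stay bounded in $t$. The Python sketch below is a minimal check under stated assumptions: it uses a finite moving average (the coefficients `[1.0, 0.6, -0.3, 0.2]` are arbitrary), standard normal innovations, and a hypothetical helper name `bn_remainder`; none of these come from the paper.

```python
import numpy as np

def bn_remainder(c, n, seed=0):
    """Compare S_t - C(1) * sum_{j<=t} eta_j with eps~_0 - eps~_t for a finite MA(q)."""
    c = np.asarray(c, dtype=float)
    q = len(c) - 1
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n + q + 1)   # eta[k] stores eta_{k-q}, k = 0..n+q

    # Tail sums c~_l = sum_{h > l} c_h for l = 0..q (so c~_q = 0).
    c_tilde = np.append(np.cumsum(c[::-1])[::-1][1:], 0.0)

    def filt(coef, t):
        # sum over l of coef[l] * eta_{t-l}
        return sum(coef[l] * eta[t + q - l] for l in range(q + 1))

    eps = np.array([filt(c, t) for t in range(1, n + 1)])              # eps_1..eps_n
    eps_tilde = np.array([filt(c_tilde, t) for t in range(0, n + 1)])  # eps~_0..eps~_n

    S = np.cumsum(eps)                          # partial sums S_t
    M = c.sum() * np.cumsum(eta[q + 1:])        # C(1) * (eta_1 + ... + eta_t)
    return S - M, eps_tilde[0] - eps_tilde[1:]

R, expected = bn_remainder([1.0, 0.6, -0.3, 0.2], n=1000)
assert np.allclose(R, expected)
print(np.abs(R).max())   # remains of order one although S_t itself grows like sqrt(t)
```

With an infinite-order filter the same identity holds provided the tail sums $\tilde{c}_{il}$ are summable; the finite case is used here only to keep the check exact.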

Since $\{\eta_t^i\}$ is an i.i.d. sequence with $E|\eta_t^i|^q < \infty$, by Theorem 4.3 of Strassen (1967), the stopping
times $\{\tau_j^i\}$ defined as in (S.64) form an independent sequence with $E\tau_j^i = E(\eta_j^i)^2$ and $E(\tau_j^i)^{q/2} < \infty$. Thus,

supi< 7 <„E| Yfj=iiT'j ~ = 0(n + n^O). As a result, we have for go = min{g,4}, 

t 

sup E|^[rj-E(r?i) 2 ]|= 0 (n 20 o). 


l<t<n 


1=1 


Let $a_i = E(\eta_t^i)^2$; then on a richer probability space there exists a standard Brownian motion $W(t)$ such that


E(^r?* - W{ait)f = 0(n2/5°). 
1=1 


(S.67) 


Thus, by (S.66) and (S.67), Condition 2(i) holds for $r = 1/q_0$. It is easy to show that the corresponding supremum is
$O(n)$. Thus, Condition 2(iii) follows by the independence of the components. Condition 2(ii)
can be shown as in Remark 3(ii). □
Proof of Remark 4. It is easy to get A* = Oe(1) when p - r is fixed. We only show the case
when m := p — r —)> oo as n —)• oo. Let d = dminj b® -1(1) process defined as in Lemma 9, 
and e = (1, • • • , 1)' be two n dimensional vectors. Let E„ and II„ be n x n 

matrices given by 


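$$E_n = I_n - \frac{1}{n} e e' \quad\text{and}\quad \Pi_n = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ -1 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -1 & 1 \end{pmatrix},$$

then for any $1 \le i, j \le m$,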


= (ei,-- - = <y^in-^{v\,vl,■■■ ,viy and 


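Let $\delta_1 \le \cdots \le \delta_n$ and $\gamma_1 \ge \cdots \ge \gamma_n$ be the eigenvalues of $\Pi_n'\Pi_n$ and $(\Pi_n^{-d})'E_nE_n'\Pi_n^{-d}$ respectively.
Since $\lambda_1(E_nE_n') = \cdots = \lambda_{n-1}(E_nE_n') = 1$, by Theorem 9 of Merikoski and Kumar (2004),
it follows that

$$\delta_{k+1}^{-d} = \lambda_{k+1}((\Pi_n^{-d})'\Pi_n^{-d})\,\lambda_{n-1}(E_nE_n') \le \gamma_k \le \lambda_k((\Pi_n^{-d})'\Pi_n^{-d})\,\lambda_1(E_nE_n') = \delta_k^{-d}. \tag{S.68}$$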

Further, $\delta_k = 2 - 2\cos(2k\pi/(2n+1))$, $k = 1, 2, \ldots, n$ (see Yueh (2005)), which implies

$$\delta_k \sim 4(k\pi/(2n+1))^2, \quad \text{as } k/n \to 0. \tag{S.69}$$
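The structural facts used about $E_n$ and $\Pi_n$ are easy to confirm numerically. The Python sketch below is an illustration under assumptions (the dimension `n = 200` and all variable names are chosen for the example): it checks that $E_nE_n'$ has eigenvalue 1 with multiplicity $n-1$ and one zero eigenvalue, that $\Pi_n^{-1}$ is the lower-triangular matrix of ones (the partial-sum operator), and that the smallest eigenvalues of $\Pi_n\Pi_n'$ are of order $(k/n)^2$. It verifies only this order of magnitude, which is what the argument uses, and not an exact closed form.

```python
import numpy as np

n = 200
e = np.ones((n, 1))
E = np.eye(n) - e @ e.T / n              # centring matrix E_n = I_n - ee'/n
Pi = np.eye(n) - np.eye(n, k=-1)         # 1 on the diagonal, -1 on the subdiagonal

ev_E = np.linalg.eigvalsh(E @ E.T)       # eigenvalues in ascending order
Pi_inv = np.linalg.inv(Pi)               # inverse of the differencing matrix
delta = np.linalg.eigvalsh(Pi @ Pi.T)    # ascending eigenvalues of Pi_n Pi_n'

# Ratio of the ten smallest eigenvalues to (k*pi/(2n+1))^2.
k = np.arange(1, 11)
ratio = delta[:10] / (k * np.pi / (2 * n + 1)) ** 2

assert abs(ev_E[0]) < 1e-8 and np.allclose(ev_E[1:], 1.0)
assert np.allclose(Pi_inv, np.tril(np.ones((n, n))))
assert np.all(ratio > 0.9) and np.all(ratio < 4.1)
```

Since $\Pi_n^{-1}$ is the cumulative-sum operator, $\Pi_n^{-d}$ applied to a white-noise vector produces an I($d$) path, which is how these matrices enter the bound on $\gamma_k$.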

Let $U$ be an orthogonal matrix with row vectors $u_1, \ldots, u_n$ such that $(\Pi_n^{-d})'E_nE_n'\Pi_n^{-d} = U\,\mathrm{diag}(\gamma_1, \ldots, \gamma_n)\,U'$
and let $\Omega = (v^1, \ldots, v^m)$. For $x \in \mathbb{R}^m$ with $x'x = 1$, define $U\Omega x =
(b_{1x}, \ldots, b_{nx})' = b_x \in \mathbb{R}^n$. By (S.68), we have


A 


'min ^2d 




t=l 


= ,r-~cy{e-tr-- ,r-~e 

> {min((Tii)}^min^x'(Uri)'diag(7i,--- ,7n)(Ufi)x 

i X 77,^“ 

r ■ / n-, 2 ■ (bybx)b;,(7i,--- ,7„)bx 

= {mm(o-n)} mm-- 

i X 




> 


> 


/ k n 

{min((Tii)}^n^“^'^Amin(fi'fi/n) min I 


v/=i 


1=1 


k 

{mm{aii)}^{k/n‘^'^)[Xjnin{^'^/n)/Xniax{ft'n/ri)]6^^^ m^n ^ ^ 

Oe(fc'-“)A„.in{[l^'(u(,... ,u',)][(u;,... ,niyn]}. 


(S.70) 
where $\lambda_{\min}$, $\lambda_{\max}$ denote the smallest and largest eigenvalues of a matrix respectively, and the last
equation follows by (S.69) and the fact that there exist two positive constants $c_1, c_2$ such that
$c_1 \le \lambda_{\min}(\Omega'\Omega/n) \le \lambda_{\max}(\Omega'\Omega/n) \le c_2$ in probability when $m/n \to 0$. Since $U$ is orthogonal

and the elements of $\Omega$ are independent standard normal variables, it follows that the elements
of $U\Omega$ are independent standard normal variables. Thus, by Theorem 2 of Bai and

Yin (1993), we have, if $m/k \in (0,1)$,

Amin{[f^'(ui,--- ,u'fc)][(u),--- ,u'fc)'n]} = (1 -^Jmlkf, a.s. (S.71) 
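The Bai and Yin (1993) limit invoked here can be illustrated by simulation. The Python sketch below is a rough check under assumptions: it draws a $k \times m$ matrix of independent standard normal entries, normalises the Gram matrix by $k$ (the normalisation is an assumption made for this example), and compares its smallest eigenvalue with $(1-\sqrt{m/k})^2$ for $k = 2m$. The function name `smallest_eig` and the sizes are illustrative choices.

```python
import numpy as np

def smallest_eig(m, k, seed=0):
    """Smallest eigenvalue of G'G / k for a k x m standard Gaussian matrix G."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((k, m))
    return np.linalg.eigvalsh(G.T @ G / k)[0]   # eigvalsh returns ascending order

m = 200
k = 2 * m                                # the choice k = 2m used in the proof
lam = smallest_eig(m, k)
limit = (1 - np.sqrt(m / k)) ** 2        # Bai-Yin limit for ratio m/k = 1/2

print(lam, limit)
assert abs(lam - limit) < 0.03
```

With $k = 2m$ the limit is bounded away from zero, so the smallest eigenvalue of the corresponding Gram matrix is of exact order $k$.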

Taking $k = 2m$, by (S.70) and (S.71), we have

^ (S.72) 

Since “ ^') ~ /o^ F(t)F'(t) (it ||2 = Op(l) (see Lemma 9), Remark 4 follows 

from (S.72). □